Reed-Muller Codes Achieve Capacity
on Erasure Channels 
Abstract: This paper introduces a new approach to proving
that a sequence of deterministic linear codes achieves capacity 
on an erasure channel under maximum a posteriori decoding. 
Rather than relying on the precise structure of the codes, this 
method requires only that the codes are highly symmetric. In 
particular, the technique applies to any sequence of linear codes 
where the blocklengths are strictly increasing, the code rates 
converge to a number between 0 and 1, and the permutation 
group of each code is doubly transitive. This also provides a rare 
example in information theory where symmetry alone implies 
near-optimal performance. 

An important consequence of this result is that a sequence of 
Reed-Muller codes with increasing blocklength achieves capacity 
if its code rate converges to a number between 0 and 1. This 
possibility has been suggested previously in the literature but 
it has only been proven for cases where the limiting code rate 
is 0 or 1. Moreover, these results extend naturally to affine- 
invariant codes and, thus, to all extended primitive narrow-sense 
BCH codes. The primary tools used in the proof are the sharp 
threshold property for monotone boolean functions and the area 
theorem for extrinsic information transfer functions. 

Index Terms —affine-invariant codes, BCH codes, capacity- 
achieving codes, erasure channels, EXIT functions, linear codes, 
MAP decoding, monotone boolean functions, Reed-Muller codes. 


I. Introduction 

Since the introduction of channel capacity by Shannon in 
his seminal paper JT), theorists have been fascinated by the 
idea of constructing structured codes that achieve capacity. 
The advent of Turbo codes 0 and low-density parity-check 
(LDPC) codes 0-J3 has made it possible to construct codes 
with low-complexity encoding and decoding that also achieve 
good performance near the Shannon limit. It was even proven 
that sequences of irregular LDPC codes can achieve capacity 
on the binary erasure channel (BEC) using low-complexity 
message-passing decoding GO- For an arbitrary binary sym¬ 
metric memoryless (BMS) channel, however, polar codes 0 
were the first provably capacity-achieving codes with low- 
complexity encoding and decoding. More recently, spatially- 
coupled LDPC codes were also shown to achieve capacity 
universally over all BMS channels under low-complexity
message-passing decoding emu 

This article considers the performance of sequences of bi¬ 
nary linear codes transmitted over the BEC under maximum-a- 
posteriori (MAP) decoding. In particular, our primary technical 
result is the following. 

Theorem: A sequence of linear codes achieves capacity on 
a memoryless erasure channel under MAP decoding if its 
blocklengths are strictly increasing, its code rates converge 
to some $r \in (0,1)$, and the permutation group¹ of each code
is doubly transitive. 

Our analysis focuses primarily on the bit erasure rate under
bit-MAP decoding but can be extended to the block erasure 
rate in some cases. One important consequence of this is that 
binary Reed-Muller codes achieve capacity on the BEC under 
block-MAP decoding. 

The main result extends naturally to $\mathbb{F}_q$-linear codes
transmitted over a $q$-ary erasure channel under symbol-MAP
decoding. With this extension, one can show that sequences
of Generalized Reed-Muller codes G2, m over F g also 
achieve capacity under block-MAP decoding. For the class 
of affine-invariant F g -linear codes, which are precisely the 
codes whose permutation groups include a subgroup isomor¬ 
phic to the affine linear group m, one finds that these 
codes achieve capacity under symbol-MAP decoding. This 
follows from the fact that the affine linear group is doubly 
transitive. As it happens, this class also includes all extended 
primitive narrow-sense Bose-Chaudhuri-Hocquengham (BCH) 
codes E3). Additionally, we show that sequences of extended 
primitive narrow-sense BCH codes over F g achieve capacity 
under block-MAP decoding. To keep the presentation simple,
we present proofs for the binary case and discuss the
generalization to $\mathbb{F}_q$ in Section V-D.

These results are rather surprising. Until the discovery of 
polar codes, it was commonly believed that codes with a 
simple deterministic structure might be unable to achieve 
capacity ns, m. While polar codes might be considered 
a counterexample to this statement, they require a somewhat 
complicated design process that is heavily dependent on the 
channel. As such, their ability to achieve capacity appears 
somewhat unrelated to the inherent symmetry in the binary 
Hadamard transform. In contrast, the performance guarantees 
obtained here are a consequence only of linearity and the 
structure induced by the symmetry of the doubly-transitive
permutation group.

1 The permutation group of a linear code is the set of permutations on code
bits under which the code is invariant.

Reed-Muller codes were introduced by Muller in [17] and,
soon after, Reed proposed a majority logic decoder in [18].
A binary Reed-Muller code, parameterized by non-negative
integers $m$ and $v$, is a linear code of length $2^m$ and dimension
$\binom{m}{0} + \cdots + \binom{m}{v}$. It is well known that the minimum distance
of this code is 2 m ~ v CO, m, ED- Thus, it is impossible 
to simultaneously have a non-vanishing rate and a minimum 
distance that scales linearly with blocklength. As such, these 
codes cannot correct all patterns with a constant fraction of 
erasures. Until now, it was not clear whether or not these codes 
can correct almost all erasure patterns up to the capacity limit. 
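
The trade-off described above can be checked numerically. The following Python sketch evaluates the length, dimension, and minimum distance formulas quoted in this paragraph. The function name `rm_parameters` and the choice $v = (m-1)/2$ (which gives rate exactly 1/2 for odd $m$) are illustrative assumptions and do not appear in the original text.

```python
from math import comb


def rm_parameters(v: int, m: int) -> tuple[int, int, int]:
    """Return (length, dimension, minimum distance) of the binary
    Reed-Muller code of order v in m variables."""
    if not 0 <= v <= m:
        raise ValueError("the order v must satisfy 0 <= v <= m")
    length = 2 ** m
    dimension = sum(comb(m, i) for i in range(v + 1))
    d_min = 2 ** (m - v)
    return length, dimension, d_min


if __name__ == "__main__":
    # For odd m, the order v = (m - 1) / 2 gives rate exactly 1/2.
    for m in (3, 5, 7, 9, 11):
        v = (m - 1) // 2
        n, k, d = rm_parameters(v, m)
        # The rate stays fixed while d_min / N shrinks with the blocklength.
        print(f"m={m:2d} v={v} N={n:5d} K={k:5d} rate={k / n:.3f} d_min/N={d / n:.4f}")
```

With the rate held at 1/2, the ratio $d_{\min}/N = 2^{-v}$ halves each time $m$ grows by 2, which is the sense in which the minimum distance cannot scale linearly with the blocklength.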

The possibility that Reed-Muller codes might achieve ca¬ 
pacity under MAP decoding is discussed in several works 0, 
ET1 - II251 . In particular, it has been conjectured in (2P and 
observed via simulations for small block lengths in [22], [23].
Moreover, these codes are believed to behave like random
codes [22], [26]. For rates approaching either 0 or 1 with
sufficient speed, it has recently been shown that Reed-Muller
codes can correct almost all erasure patterns up to the capacity
limit² [25]. Unfortunately, this approach currently falls short
if the rates are bounded away from 0 and 1.

Even 50 years after their discovery, Reed-Muller codes
remain an active area of research in theoretical computer
science and coding theory. The early work in [27]-[29]
culminated in obtaining asymptotically tight bounds (fixed order
$v$ and asymptotic $m$) for their weight distribution [30]. Also,
there is considerable interest in constructing low-complexity
decoding algorithms [31]-[35]. Undoubtedly, interest in the
coding theory community for these codes was rekindled by the
tremendous success of polar codes and their close connection
to Reed-Muller codes 0, (23) . (36). 

Due to their desirable structure, constructions based on these 
codes are used extensively in cryptography (26), E3-E3- 
Reed-Muller codes are also known for their locality EE). Some 
of the earliest known constructions for locally correctable 
codes are based on these codes E), |46]. Interestingly, local 
correctability of Reed-Muller codes is also a consequence of
their permutation group being doubly transitive [47], a crucial
requirement in our approach. However, a doubly transitive
permutation group is not sufficient for local testability [48].

The central object in our analysis is the extrinsic information
transfer (EXIT) function. EXIT charts were introduced by
ten Brink [49] in the context of turbo decoding as a visual
tool to understand iterative decoding. This work led to the
area theorem for EXIT functions in [50], which was further
developed in [51]. For a given input bit, the EXIT function
is defined to be the conditional entropy of the input bit given
the outputs associated with all other input bits. The average
EXIT function is formed by averaging all of the bit EXIT
functions. We note that these functions are also instrumental
in the design and analysis of LDPC codes [52].

An important property of EXIT functions is the EXIT area
theorem, which says that the area under the average EXIT
function equals the rate of the code. The value of the EXIT
function at a particular erasure value is also directly related
to the bit erasure probability under bit-MAP decoding. For
a sequence of binary linear codes with rate $r$ to be capacity
achieving, the bit erasure probability, and therefore the average
EXIT function, must converge to 0 for any erasure rate below
$1 - r$. Since the areas under the average EXIT curves are fixed
to $r$, the EXIT functions for these codes must also converge to
1 for any erasure rate above $1 - r$. Thus, the EXIT curves must
exhibit a sharp transition from 0 to 1 and, as a consequence
of the area theorem, this transition must occur at the erasure
value of $1 - r$.

2 It requires some effort to define precisely what the capacity limit is for
rates approaching 0 or 1. See [25, Definition 2.5] for details.

We investigate the threshold behavior of EXIT functions
for certain binary linear codes via sharp thresholds for monotone
boolean functions [53], [54]. The general method was
pioneered by Margulis [55] and Russo [56]. Later, it was
significantly generalized in [57] and [58]. This approach has
been applied to many problems in theoretical computer science
with remarkable success [58]-[60]. In the context of coding
theory, Zémor introduced this approach in [61]. It was refined
further in [62], and also extended to AWGN channels in [63].
For the BEC, [61], [62] show that the block erasure rate jumps
sharply from 0 to 1 as the minimum distance of the code
grows. However, this approach does not generalize directly to
EXIT functions and, therefore, does not establish the location
of the threshold. To show the threshold behavior for EXIT
functions, we instead focus on symmetry [58] and require that
the codes of interest have permutation groups that are doubly
transitive.

After we completed the first version of this paper [64], we
discovered that the same approach was being pursued
independently by Kudekar, Mondelli, Şaşoğlu, and Urbanke [65].

The article is organized as follows. Section II includes
necessary background on EXIT functions, permutation groups
of linear codes, and capacity-achieving codes. Section III deals
with the threshold behavior of monotone boolean functions. In
Section IV, as an application of the preceding analysis, we show
that Reed-Muller and extended primitive narrow-sense BCH
codes achieve capacity. Finally, we provide extensions and open
problems in Section V and concluding remarks in Section VI.

II. Preliminaries 

This article deals primarily with binary linear codes
transmitted over erasure channels and bit-MAP decoding. In the
following, all codes are understood to be proper binary linear
codes with minimum distance at least 2, unless mentioned
otherwise. Recall that a linear code is proper if no codeword
position is 0 in all codewords. Let $\mathcal{C}$ denote an $(N, K)$
binary linear code with length $N$ and dimension $K$. The rate
of this code is given by $r = K/N$. Denote the minimum
distance of $\mathcal{C}$ by $d_{\min}$. We assume that a random codeword
is chosen uniformly from this code and transmitted over a
memoryless BEC. In the following subsections, we review
several important definitions and properties related to this
setup.

Notational convention: 

• The natural numbers are denoted by $\mathbb{N} = \{1, 2, \ldots\}$.

• For $n \in \mathbb{N}$, let $[n]$ denote the set $\{1, 2, \ldots, n\}$.

• We associate a binary sequence in $\{0,1\}^N$ with a subset
of $[N]$ defined by the non-zero indices in the sequence.
We use this equivalence between sets and binary sequences
extensively. For example, the sequence 1001100
is identified with the subset $\{1,4,5\} \subseteq [7]$ and vice versa.
Similarly, if 101110 is a codeword in $\mathcal{C}$, then we say
$\{1,3,4,5\} \in \mathcal{C}$.

• We say that a set $A$ covers a set $B$ if $B \subseteq A$. Also, for
sequences $a, b \in \{0,1\}^N$, we write $a \le b$ if $a_i \le b_i$ for
all $i \in [N]$. Equivalently, $a \le b$ if the set associated with $b$
covers the set associated with $a$.

• For a set $A$, $\mathbb{1}_A(\cdot)$ denotes its indicator function. The
random variable $\mathbb{1}_{\{\cdot\}}$ is an indicator of some event.
For example, for random variables $X$ and $Y$, $\mathbb{1}_{\{X \neq Y\}}$
indicates the event $X \neq Y$.

• For a vector $a = (a_1, a_2, \ldots, a_N)$, the shorthand $a_{\sim i}$
denotes $(a_1, \ldots, a_{i-1}, a_{i+1}, \ldots, a_N)$.

• $0^n$ and $1^n$ denote the all-zero and all-one sequences of
length $n$, respectively.

• A memoryless BEC with erasure probability $p$ is denoted
by BEC($p$). If the erasure probability is different for each
bit, then we write BEC($\underline{p}$), where $\underline{p} = (p_1, \ldots, p_N)$ and
$p_i$ indicates the erasure probability of bit $i$.

• For a quantity $\theta$ with index $n$, we use either $\theta_n$ or
$\theta^{(n)}$. Typically, we write $\theta^{(n)}$ when using $\theta_n$ may cause
confusion with another quantity such as $\theta_i$; in the latter
case we write $\theta_i^{(n)}$.

• For a permutation $\pi \colon [N] \to [N]$ and $A \subseteq [N]$, $\pi(A)$
denotes the set $\{\pi(\ell) \mid \ell \in A\}$. For a sequence $a \in \{0,1\}^N$,
$\pi(a)$ denotes the length-$N$ sequence where the $\pi(i)$-th
element is $a_i$.

• As is standard in information theory, $H(\cdot)$ denotes the
entropy of a discrete random variable and $H(\cdot \mid \cdot)$ denotes
the conditional entropy of a discrete random variable in
bits.

• All logarithms in this article are natural unless the base
is explicitly mentioned.

A. Bit and Block Erasure Probability 

The input and output alphabets of the BEC are denoted
by $\mathcal{X} = \{0,1\}$ and $\mathcal{Y} = \{0,1,*\}$, respectively. Let
$\underline{X} = (X_1, \ldots, X_N) \in \mathcal{X}^N$ be a uniform random codeword and
$\underline{Y} = (Y_1, \ldots, Y_N) \in \mathcal{Y}^N$ be the received sequence obtained
by transmitting $\underline{X}$ through a BEC($p$). In this article, our main
interest is the bit-MAP decoder. However, we will also obtain some
results for the block-MAP decoder indirectly, based on our
analysis of the bit-MAP decoder.

For linear codes and erasure channels, it is possible to 
recover the transmitted codeword if and only if the erasure 
pattern does not cover any codeword. To see this, fix an 
erasure pattern and observe that adding a codeword to the input 
sequence causes the output sequence to change if and only if 
the erasure pattern does not cover the codeword. Similarly, 
it is possible to recover bit i if and only if the erasure 
pattern does not cover any codeword where bit i is non-zero. 
Whenever bit $i$ cannot be recovered uniquely, the symmetry
of a linear code implies that the set of codewords matching the
unerased observations has an equal number of 0's and 1's in
bit position $i$. In this case, the posterior marginal of bit $i$ given
the observations contains no information about bit $i$.
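
The covering criterion above can be tested directly on a small code. The following Python sketch is an illustration under stated assumptions: it uses the [8,4] first-order Reed-Muller code RM(1,3) as an example, indexes positions from 0 rather than 1, and the helper names (`block_recoverable`, `bit_recoverable`) are hypothetical. It enumerates all codewords by brute force, which is only feasible for very small codes.

```python
from itertools import product

# Generator matrix of RM(1,3): the rows are the evaluations of
# 1, x1, x2, x3 at the 8 points of F_2^3 (an illustrative choice).
G = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
]


def nonzero_supports(gen):
    """Supports (sets of non-zero positions) of all non-zero codewords."""
    n = len(gen[0])
    supports = []
    for msg in product((0, 1), repeat=len(gen)):
        word = [sum(m * row[j] for m, row in zip(msg, gen)) % 2 for j in range(n)]
        if any(word):
            supports.append(frozenset(j for j, bit in enumerate(word) if bit))
    return supports


SUPPORTS = nonzero_supports(G)


def block_recoverable(erased):
    """The codeword is recoverable iff the erasure pattern covers no
    non-zero codeword."""
    erased = frozenset(erased)
    return not any(s <= erased for s in SUPPORTS)


def bit_recoverable(i, erased):
    """Bit i is recoverable iff the erasure pattern covers no codeword
    whose bit i is non-zero."""
    erased = frozenset(erased)
    return not any(i in s and s <= erased for s in SUPPORTS)


if __name__ == "__main__":
    pattern = {0, 1, 3, 5, 7}  # covers the weight-4 codeword on {1, 3, 5, 7}
    print("block recoverable:", block_recoverable(pattern))
    print("bit 0 recoverable:", bit_recoverable(0, pattern))
    print("bit 1 recoverable:", bit_recoverable(1, pattern))
```

The example pattern shows why the bit-MAP and block-MAP decoders behave differently: the block cannot be recovered, yet an individual erased bit may still be determined by the unerased positions.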

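Let $D_i \colon \mathcal{Y}^N \to \mathcal{X} \cup \{*\}$ denote the bit-MAP decoder for
bit $i$ of $\mathcal{C}$. For a received sequence $\underline{Y}$, if $X_i$ can be recovered
uniquely, then $D_i(\underline{Y}) = X_i$. Otherwise, $D_i$ declares an erasure
and returns $*$. Let the erasure probability for bit $i \in [N]$ be

\[ P_{b,i} = \Pr(D_i(\underline{Y}) \neq X_i), \]

and the average bit erasure probability be

\[ P_b = \frac{1}{N} \sum_{i=1}^{N} P_{b,i}. \]

Whenever bit $i$ can be recovered from a received sequence
$\underline{Y} = \underline{y}$, $H(X_i \mid \underline{Y} = \underline{y}) = 0$. Otherwise, the uniform codeword
assumption implies that the posterior marginal of bit $i$ given
the observations is $\Pr(X_i = x \mid \underline{Y} = \underline{y}) = \frac{1}{2}$ and
$H(X_i \mid \underline{Y} = \underline{y}) = 1$. This immediately implies that

\[ P_{b,i} = H(X_i \mid \underline{Y}), \qquad P_b = \frac{1}{N} \sum_{i=1}^{N} H(X_i \mid \underline{Y}). \]

Let $D \colon \mathcal{Y}^N \to \mathcal{X}^N \cup \{*\}$ denote the block-MAP decoder
for $\mathcal{C}$. Given a received sequence $\underline{Y}$, the vector $D(\underline{Y})$ is equal
to $\underline{X}$ whenever it is possible to uniquely recover $\underline{X}$ from $\underline{Y}$.
Otherwise, $D$ declares an erasure and returns $*$. Therefore, the
block erasure probability is given by

\[ P_B = \Pr(D(\underline{Y}) \neq \underline{X}). \]

Using the set equivalence

\[ \{D(\underline{Y}) \neq \underline{X}\} = \bigcup_{i \in [N]} \{D_i(\underline{Y}) \neq X_i\}, \]

it is easy to see that

\[ P_{b,i} \le P_B, \qquad P_b \le P_B, \qquad P_B \le N P_b. \tag{1} \]

Also, if $D$ declares an erasure, there will be at least $d_{\min}$ bits
in erasure. Therefore,

\[ d_{\min} \, \mathbb{1}_{\{D(\underline{Y}) \neq \underline{X}\}} \le \sum_{i \in [N]} \mathbb{1}_{\{D_i(\underline{Y}) \neq X_i\}}. \]

Taking expectations on both sides gives a tighter bound on $P_B$
in terms of $P_b$,

\[ P_B \le \frac{N}{d_{\min}} P_b. \tag{2} \]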

B. MAP EXIT Functions 

Again, let $\underline{X} = (X_1, \ldots, X_N)$ denote a uniformly selected
codeword from $\mathcal{C}$ and $\underline{Y}$ be the sequence obtained from
observing $\underline{X}$ with some positions erased. In this case, however,
we assume $X_i$ is transmitted over the BEC($p_i$) channel. We
refer to this as the BEC($\underline{p}$) channel where $\underline{p} = (p_1, \ldots, p_N)$
is the vector of channel erasure probabilities. While one
typically evaluates all quantities of interest at $\underline{p} = (p, \ldots, p)$,
such a parametrization provides a convenient mathematical
framework for many derivations.
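The vector EXIT function associated with bit $i$ of $\mathcal{C}$ is
defined by

\[ h_i(\underline{p}) \triangleq H\big(X_i \mid \underline{Y}_{\sim i}(\underline{p}_{\sim i})\big). \]

Also, the average vector EXIT function is defined by

\[ h(\underline{p}) \triangleq \frac{1}{N} \sum_{i=1}^{N} h_i(\underline{p}). \]

Note that, while we define $h_i$ as a function of $\underline{p}$ for uniformity,
it does not depend on $p_i$. In terms of vector EXIT functions,
the standard scalar EXIT functions $h(p)$ and $h_i(p)$ (for $i \in [N]$)
are given by

\[ h_i(p) \triangleq h_i(\underline{p}) \Big|_{\underline{p} = (p, \ldots, p)}, \qquad h(p) \triangleq h(\underline{p}) \Big|_{\underline{p} = (p, \ldots, p)}. \]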

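The bit erasure probabilities and the EXIT functions $h(p)$
and $h_i(p)$ have a close relationship. Observe that

\[
\begin{aligned}
H(X_i \mid \underline{Y}) &= \Pr(Y_i = *) \, H(X_i \mid \underline{Y}_{\sim i}, Y_i = *) \\
&\quad + \Pr(Y_i = X_i) \, H(X_i \mid \underline{Y}_{\sim i}, Y_i = X_i) \\
&= \Pr(Y_i = *) \, H(X_i \mid \underline{Y}_{\sim i}).
\end{aligned}
\]

Therefore,

\[ P_{b,i}(p) = p \, h_i(p), \qquad P_b(p) = p \, h(p). \tag{3} \]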

We now state several well-known properties of these EXIT
functions [50], [51], which play a crucial role in the subsequent
analysis. It is worth noting that the original definition of
EXIT charts in [50] focused on mutual information $I(\underline{X}; \underline{Y})$
while later work on EXIT functions focused on the conditional
entropy $H(\underline{X} \mid \underline{Y})$ [51]. In our setting, this difference results
only in trivial remappings of all discussed quantities.


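Proposition 1: For a code $\mathcal{C}$ on the BEC($\underline{p}$) channel, the EXIT
function associated with bit $i$ satisfies

\[ h_i(\underline{p}) = \frac{\partial H(\underline{X} \mid \underline{Y}(\underline{p}))}{\partial p_i}. \]

For a parametrized path $\underline{p}(t) = (p_1(t), \ldots, p_N(t))$ defined for
$t \in [0,1]$, where each $p_i'(t)$ is continuous, one finds

\[ H\big(\underline{X} \mid \underline{Y}(\underline{p}(1))\big) - H\big(\underline{X} \mid \underline{Y}(\underline{p}(0))\big) = \int_0^1 \sum_{i=1}^{N} h_i(\underline{p}(t)) \, p_i'(t) \, dt. \]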


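Proof: This result is implied by the results of both [50]
and [51]. For completeness, we repeat here the proof from [51,
Theorem 2] using our notation.

For the first statement, we start by using the chain rule of
entropy to write

\[ H(\underline{X} \mid \underline{Y}(\underline{p})) = H(X_i \mid \underline{Y}(\underline{p})) + H(\underline{X}_{\sim i} \mid X_i, \underline{Y}(\underline{p})). \]

Then, we observe that

\[ H(\underline{X}_{\sim i} \mid X_i, \underline{Y}(\underline{p})) = H\big(\underline{X}_{\sim i} \mid X_i, \underline{Y}_{\sim i}(\underline{p}_{\sim i})\big) \]

is independent of $p_i$. Since

\[
\begin{aligned}
H(X_i \mid \underline{Y}(\underline{p})) &= \Pr(Y_i = *) \, H\big(X_i \mid \underline{Y}_{\sim i}(\underline{p}_{\sim i}), Y_i = *\big) \\
&\quad + \Pr(Y_i = X_i) \, H\big(X_i \mid \underline{Y}_{\sim i}(\underline{p}_{\sim i}), Y_i = X_i\big) \\
&= p_i \, H\big(X_i \mid \underline{Y}_{\sim i}(\underline{p}_{\sim i})\big),
\end{aligned}
\]

we find that

\[ \frac{\partial H(\underline{X} \mid \underline{Y}(\underline{p}))}{\partial p_i} = H\big(X_i \mid \underline{Y}_{\sim i}(\underline{p}_{\sim i})\big) = h_i(\underline{p}). \]

The second statement now follows directly from vector
calculus. ■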

The following sets characterize the EXIT functions $h_i$ and
we will refer to them throughout the article.

Definition 2: Consider a code $\mathcal{C}$ and the indirect recovery of
$X_i$ from the subvector $\underline{Y}_{\sim i}$ (i.e., the bit-MAP decoding of $X_i$
from $\underline{Y}$ when $Y_i = *$). For $i \in [N]$, the set of erasure patterns
that prevent indirect recovery of $X_i$ under bit-MAP decoding
is given by

\[ \Omega_i \triangleq \big\{ A \subseteq [N] \setminus \{i\} \;\big|\; \exists B \subseteq [N] \setminus \{i\}, \; B \cup \{i\} \in \mathcal{C}, \; B \subseteq A \big\}. \]

For distinct $i, j \in [N]$, the set of erasure patterns where the
$j$-th bit is pivotal for the indirect recovery of $X_i$ is given by

\[ \partial_j \Omega_i \triangleq \big\{ A \subseteq [N] \setminus \{i\} \;\big|\; A \setminus \{j\} \notin \Omega_i, \; A \cup \{j\} \in \Omega_i \big\}. \]

These are erasure patterns where $X_i$ can be recovered from
$\underline{Y}_{\sim i}$ if and only if $Y_j \neq *$ (i.e., the $j$-th bit is not erased).
Note that $\partial_j \Omega_i$ includes patterns from both $\Omega_i$ and $\Omega_i^c$.

Intuitively, $\Omega_i$ is the set of all erasure patterns that cover
some codeword whose $i$-th bit is 1. Also, since the minimum
distance of $\mathcal{C}$ is at least 2 by assumption, the decoder can
always recover bit $i$ indirectly if no other bits are erased. Thus,
$\Omega_i$ does not contain the empty set. For $j \in [N] \setminus \{i\}$, the set
$\partial_j \Omega_i$ characterizes the boundary erasure patterns where flipping
the erasure status of the $j$-th bit moves the pattern between $\Omega_i$
and $\Omega_i^c$.


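Proposition 3: For a code $\mathcal{C}$ on the BEC($\underline{p}$) channel, we have
the following explicit expressions.

a) For bit $i$, the EXIT function is given by

\[ h_i(\underline{p}) = \sum_{A \in \Omega_i} \prod_{\ell \in A} p_\ell \prod_{\ell \in A^c \setminus \{i\}} (1 - p_\ell). \]

b) For distinct $i$ and $j$, the mixed partial derivative satisfies

\[ \frac{\partial^2 H(\underline{X} \mid \underline{Y}(\underline{p}))}{\partial p_j \, \partial p_i} = \frac{\partial h_i(\underline{p})}{\partial p_j} = \sum_{A \in \partial_j \Omega_i} \prod_{\ell \in A} p_\ell \prod_{\ell \in A^c \setminus \{i\}} (1 - p_\ell). \]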

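Proof: For a), the definition of $h_i$ implies

\[
\begin{aligned}
h_i(\underline{p}) &= H\big(X_i \mid \underline{Y}_{\sim i}(\underline{p}_{\sim i})\big) \\
&= \sum_{\underline{y}_{\sim i} \in \mathcal{Y}^{N-1}} \Pr(\underline{Y}_{\sim i} = \underline{y}_{\sim i}) \, H(X_i \mid \underline{Y}_{\sim i} = \underline{y}_{\sim i}).
\end{aligned}
\]

Note that either $y_\ell = x_\ell$ or $y_\ell = *$. Let $A \subseteq [N] \setminus \{i\}$ be the
set of indices where $y_\ell = *$ so that

\[ \Pr(\underline{Y}_{\sim i} = \underline{y}_{\sim i}) = \prod_{\ell \in A} p_\ell \prod_{\ell \in A^c \setminus \{i\}} (1 - p_\ell). \]

If $A \cup \{i\}$ covers a codeword in $\mathcal{C}$ whose $i$-th bit is non-zero,
then the bit-MAP decoder fails to decode bit $i$. Also, since
the posterior probability of $X_i$ given $\underline{Y}_{\sim i} = \underline{y}_{\sim i}$ is uniform,

\[ H(X_i \mid \underline{Y}_{\sim i} = \underline{y}_{\sim i}) = 1. \]

If $A \cup \{i\}$ does not cover any codeword in $\mathcal{C}$ with non-zero
bit $i$, then the MAP estimate of $X_i$ given $\underline{Y}_{\sim i} = \underline{y}_{\sim i}$ is equal
to $X_i$ and $H(X_i \mid \underline{Y}_{\sim i} = \underline{y}_{\sim i}) = 0$.

Thus, the EXIT function $h_i(\underline{p})$ is given by summing over
the first set of erasure patterns where the entropy is 1. This
set is precisely $\Omega_i$, the set of all erasure patterns that cover a
codeword whose $i$-th bit is non-zero.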

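For b), we evaluate the partial derivative using the explicit
evaluation of $h_i(\underline{p})$ from part a). Suppose $A \in \Omega_i$. To simplify
things, we handle the two groups separately.

If $A \cup \{j\} \in \Omega_i$ and $A \setminus \{j\} \in \Omega_i$, then we observe that

\[ \sum_{B \in \{A \cup \{j\}, \, A \setminus \{j\}\}} \prod_{\ell \in B} p_\ell \prod_{\ell \in B^c \setminus \{i\}} (1 - p_\ell) = \prod_{\ell \in A \setminus \{j\}} p_\ell \prod_{\ell \in A^c \setminus \{i,j\}} (1 - p_\ell) \]

is independent of the variable $p_j$. Thus, its partial derivative
with respect to $p_j$ is zero.

On the other hand, if $A \cup \{j\} \in \Omega_i$ but $A \setminus \{j\} \notin \Omega_i$, then
$j \in A$. In this case, the contribution of $A$ to $h_i(\underline{p})$ can be
written as

\[ h_{i,A}(\underline{p}) = \prod_{\ell \in A} p_\ell \prod_{\ell \in A^c \setminus \{i\}} (1 - p_\ell). \]

Since $j \in A$, we find that

\[ \frac{\partial h_{i,A}(\underline{p})}{\partial p_j} = \prod_{\ell \in A \setminus \{j\}} p_\ell \prod_{\ell \in A^c \setminus \{i\}} (1 - p_\ell) \tag{4} \]

and, since the derivative is zero for patterns in the first group,
we get

\[ \frac{\partial h_i(\underline{p})}{\partial p_j} = \sum_{A \in \{B \in \Omega_i \,\mid\, B \setminus \{j\} \notin \Omega_i\}} \frac{\partial h_{i,A}(\underline{p})}{\partial p_j}. \tag{5} \]

We can also rewrite (4) as

\[ \frac{\partial h_{i,A}(\underline{p})}{\partial p_j} = \sum_{B \in \{A \cup \{j\}, \, A \setminus \{j\}\}} \prod_{\ell \in B} p_\ell \prod_{\ell \in B^c \setminus \{i\}} (1 - p_\ell), \tag{6} \]

where the effect of $p_j$ is removed by summing over $A \cup \{j\}$
and $A \setminus \{j\}$. Substituting (6) into (5) gives the desired result
because $\partial_j \Omega_i$ is equal to the union of $\{A \in \Omega_i \mid A \setminus \{j\} \notin \Omega_i\}$
and $\{A \notin \Omega_i \mid A \cup \{j\} \in \Omega_i\}$. ■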

The following proposition restates some known results in
our notation. The area theorem, stated below as c), first
appeared in [50, Theorem 1], and the explicit evaluation of
$h_i(p)$, stated below in a), is a restatement of [52, Lemma
3.74(iv)].


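Proposition 4: For a code $\mathcal{C}$ and transmission over a BEC,
we have the following properties for the EXIT functions.

a) The EXIT function associated with bit $i$ satisfies

\[ h_i(p) = \sum_{A \in \Omega_i} p^{|A|} (1 - p)^{N - 1 - |A|}. \]

b) For $j \in [N] \setminus \{i\}$, the partial derivative satisfies

\[ \frac{\partial h_i(\underline{p})}{\partial p_j} \bigg|_{\underline{p} = (p, \ldots, p)} = \sum_{A \in \partial_j \Omega_i} p^{|A|} (1 - p)^{N - 1 - |A|}. \]

c) The average EXIT function satisfies the area theorem

\[ \int_0^1 h(p) \, dp = \frac{K}{N}. \]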

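Proof: The first two parts follow from Proposition 3. For
the third part, we use the second statement of Proposition 1 with
the path $\underline{p}(t) = (t, \ldots, t)$. This gives

\[ H(\underline{X} \mid \underline{Y}(1)) - H(\underline{X} \mid \underline{Y}(0)) = \int_0^1 \sum_{i=1}^{N} h_i(t) \, dt = N \int_0^1 h(t) \, dt. \]

Also, $H(\underline{X} \mid \underline{Y}(1)) = H(\underline{X}) = K$ and $H(\underline{X} \mid \underline{Y}(0)) = 0$.
Combining these observations gives the desired result. ■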
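
Parts a) and c) of Proposition 4 can be checked exactly on a small example. The sketch below is an illustration only: it assumes the [8,4] code RM(1,3) as a test case, indexes positions from 0, and uses hypothetical helper names (`omega`, `exit_function`, `exit_area`). It builds each set $\Omega_i$ by brute force and integrates the polynomial in a) term by term using the standard identity $\int_0^1 p^a (1-p)^b \, dp = a! \, b! / (a+b+1)!$.

```python
from fractions import Fraction
from itertools import combinations, product
from math import factorial

# Generator matrix of RM(1,3), used here purely as a small test case.
G = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 1, 0, 1],
    [0, 0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1],
]
K, N = len(G), len(G[0])


def nonzero_supports(gen):
    """Supports of all non-zero codewords generated by `gen`."""
    supports = []
    for msg in product((0, 1), repeat=len(gen)):
        word = [sum(m * row[j] for m, row in zip(msg, gen)) % 2
                for j in range(len(gen[0]))]
        if any(word):
            supports.append(frozenset(j for j, bit in enumerate(word) if bit))
    return supports


SUPPORTS = nonzero_supports(G)


def omega(i):
    """Erasure patterns A in [N] minus {i} that cover some B with
    B + {i} a codeword, i.e. the set Omega_i of Definition 2."""
    others = [j for j in range(N) if j != i]
    cores = [s - {i} for s in SUPPORTS if i in s]
    patterns = []
    for size in range(N):
        for subset in combinations(others, size):
            pattern = frozenset(subset)
            if any(core <= pattern for core in cores):
                patterns.append(pattern)
    return patterns


def exit_function(i, p):
    """Scalar EXIT function h_i(p) from Proposition 4a)."""
    return sum(p ** len(a) * (1 - p) ** (N - 1 - len(a)) for a in omega(i))


def exit_area(i):
    """Exact area under h_i, integrating each monomial in closed form."""
    return sum(
        Fraction(factorial(len(a)) * factorial(N - 1 - len(a)), factorial(N))
        for a in omega(i)
    )


if __name__ == "__main__":
    average_area = sum(exit_area(i) for i in range(N)) / N
    print("area under average EXIT function:", average_area)
    print("rate K/N:", Fraction(K, N))
```

Using exact rational arithmetic avoids any numerical integration error, so the comparison with $K/N$ is an equality test rather than an approximation.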

Since the code $\mathcal{C}$ is proper by assumption, $\Omega_i$ is non-empty
and, in particular, $[N] \setminus \{i\} \in \Omega_i$. Thus, $h_i$ is not a constant
function equal to 0 and $h_i(1) = 1$. Since the minimum distance
of the code $\mathcal{C}$ is at least 2 by assumption, $\Omega_i$ does not contain
the empty set. This implies that $h_i$ is not a constant function
equal to 1 and that $h_i(0) = 0$. As such, $h_i$ is a non-constant
polynomial. Also, $h_i$ is non-decreasing because Proposition
4b) implies that $dh_i/dp \ge 0$. It follows that $h_i$ is strictly
increasing because a non-constant non-decreasing polynomial
must be strictly increasing.

Consequently, the EXIT functions $h_i(p)$, and therefore $h(p)$,
are continuous, strictly increasing polynomial functions on
$[0,1]$ with $h(0) = h_i(0) = 0$ and $h(1) = h_i(1) = 1$.

The inverse function for the average EXIT function is
therefore well-defined on $[0,1]$. For $t \in [0,1]$, let

\[ p_t \triangleq h^{-1}(t) = \inf\{p \in [0,1] \mid h(p) \ge t\}, \tag{7} \]

and note that $h(p_t) = t$.
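
Because $h$ is continuous and strictly increasing, the inverse in (7) can be evaluated numerically by bisection. The sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, and the test function $h(p) = p^{N-1}$ is the EXIT function of the length-$N$ repetition code, which follows from Proposition 4a) because the only codeword with bit $i$ set is the all-ones word, so $\Omega_i = \{[N] \setminus \{i\}\}$. This example is not taken from the original text.

```python
def inverse_exit(h, t, tol=1e-12):
    """Approximate p_t = inf{p in [0,1] : h(p) >= t} by bisection.

    Assumes h is continuous and strictly increasing on [0, 1] with
    h(0) = 0 and h(1) = 1, as established for EXIT functions above.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if h(mid) >= t:
            hi = mid  # mid satisfies the constraint, so p_t <= mid
        else:
            lo = mid  # mid violates the constraint, so p_t > mid
    return hi


def repetition_exit(n):
    """EXIT function of the length-n repetition code, h(p) = p^(n-1)."""
    return lambda p: p ** (n - 1)


def transition_width(h, eps):
    """Length of the interval on which h rises from eps to 1 - eps."""
    return inverse_exit(h, 1 - eps) - inverse_exit(h, eps)


if __name__ == "__main__":
    for n in (3, 9, 33, 129):
        h = repetition_exit(n)
        print(f"N={n:4d} p_0.5={inverse_exit(h, 0.5):.4f} "
              f"width(eps=0.1)={transition_width(h, 0.1):.4f}")
```

For the repetition code the inverse is available in closed form, $p_t = t^{1/(N-1)}$, which makes it a convenient check of the bisection routine before applying it to EXIT functions that have no simple expression.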

C. Permutations of Linear Codes 

Let $S_N$ be the symmetric group on $N$ elements. The
permutation group of a code is defined as the subgroup of
$S_N$ whose group action on the bit ordering preserves the set
of codewords [66, Section 1.6].

Definition 5: The permutation group $\mathcal{G}$ of a code $\mathcal{C}$ is defined
to be

\[ \mathcal{G} = \{\pi \in S_N \mid \pi(A) \in \mathcal{C} \text{ for all } A \in \mathcal{C}\}. \]

Interestingly, for binary linear codes, the permutation group 
is isomorphic to the group of weight-preserving linear trans¬ 
formations of the code JT9] Section 8.5], US Section 7.9], 

ED. 

Definition 6: Suppose $\mathcal{G}$ is a permutation group. Then,

a) $\mathcal{G}$ is transitive if, for any $i, j \in [N]$, there exists a
permutation $\pi \in \mathcal{G}$ such that $\pi(i) = j$, and

b) $\mathcal{G}$ is doubly transitive if, for any distinct $i, j, k \in [N]$, there
exists a $\pi \in \mathcal{G}$ such that $\pi(i) = i$ and $\pi(j) = k$.

Note that any non-trivial code (i.e., 0 < r < 1) whose 
permutation group is transitive must be proper and have 
minimum distance at least two. 
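
Definitions 5 and 6 can be verified by exhaustive search on a small code. The sketch below is an illustration under stated assumptions: it takes RM(1,3) as the example, identifies position $x \in \{0, \ldots, 7\}$ with a point of $\mathbb{F}_2^3$ through its binary expansion, and generates the affine maps $x \mapsto Mx + b$ with $M$ invertible. All helper names are hypothetical. It checks that every such map preserves the code and that the resulting set of permutations is doubly transitive in the sense of Definition 6b).

```python
from itertools import product

M_BITS = 3
N = 1 << M_BITS  # 8 positions, indexed by the points of F_2^3


def apply_linear(cols, x):
    """Multiply the binary matrix with columns `cols` by the vector x."""
    y = 0
    for bit, col in enumerate(cols):
        if (x >> bit) & 1:
            y ^= col
    return y


def affine_permutations():
    """All maps x -> Mx + b over F_2 with M invertible, as tuples."""
    perms = []
    for cols in product(range(N), repeat=M_BITS):
        images = [apply_linear(cols, x) for x in range(N)]
        if len(set(images)) != N:
            continue  # singular matrix: not a permutation of the points
        for shift in range(N):
            perms.append(tuple(y ^ shift for y in images))
    return perms


def rm1_codewords():
    """Supports of RM(1,3): the sets where a0 + <a, x> equals 1."""
    words = set()
    for a0, a in product((0, 1), range(N)):
        words.add(frozenset(
            x for x in range(N) if (a0 + bin(a & x).count("1")) % 2
        ))
    return words


def preserves_code(perm, code):
    """Check that pi(A) is a codeword for every codeword A."""
    return all(frozenset(perm[x] for x in word) in code for word in code)


def is_doubly_transitive(perms, n):
    """For each i and j != i, the permutations fixing i must send j to
    every k != i (Definition 6b, with k = j covered by the identity)."""
    reachable = {(i, j): set() for i in range(n) for j in range(n) if i != j}
    for perm in perms:
        for i in range(n):
            if perm[i] == i:
                for j in range(n):
                    if j != i:
                        reachable[(i, j)].add(perm[j])
    return all(len(targets) == n - 1 for targets in reachable.values())


if __name__ == "__main__":
    perms = affine_permutations()
    code = rm1_codewords()
    print("affine maps:", len(perms))
    print("all preserve RM(1,3):", all(preserves_code(p, code) for p in perms))
    print("doubly transitive:", is_doubly_transitive(perms, N))
```

For contrast, the eight cyclic shifts of the positions form a transitive group that is not doubly transitive, since only the identity fixes any given position.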

In the following, we explore some interesting symmetries 
of EXIT functions when the permutation group of the code is 
transitive or doubly transitive. 
Proposition 7: Suppose the permutation group $\mathcal{G}$ of a code $\mathcal{C}$
is transitive. Then, for any $i \in [N]$,

\[ h(p) = h_i(p) \quad \text{for } 0 \le p \le 1. \]

Proof: Since $\mathcal{G}$ is transitive, for any $i, j \in [N]$, there
exists a permutation $\pi \in \mathcal{G}$ such that $\pi(i) = j$. This will allow us
to show that there is a bijection between $\Omega_i$ and $\Omega_j$ induced
by the action of $\pi$ on the codeword indices. To do this, we
first show that $A \in \Omega_i$ implies $\pi(A) \in \Omega_j$.

Since $A \in \Omega_i$, by definition, there exists $B \subseteq A$ such that
$B \cup \{i\} \in \mathcal{C}$. Since $\pi \in \mathcal{G}$, $\pi(B \cup \{i\}) \in \mathcal{C}$. Also, $\pi(B \cup \{i\}) =
\pi(B) \cup \{j\}$ and $\pi(B) \subseteq \pi(A)$. Consequently, $\pi(A) \in \Omega_j$.

Similarly, if $A \in \Omega_j$, then $\pi^{-1}(A) \in \Omega_i$. Thus, there is
a bijection between $\Omega_i$ and $\Omega_j$ induced by $\pi$. This bijection
also preserves the weight of the vectors in each set (i.e., $|A| =
|\pi(A)|$).

Since Proposition 4a) implies that $h_i(p)$ only depends on
the weights of elements in $\Omega_i$, it follows that $h_i(p) = h_j(p)$.
This also implies that $h(p) = h_i(p)$ for all $0 \le p \le 1$. ■


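Proposition 8: Suppose that the permutation group $\mathcal{G}$ of a
code $\mathcal{C}$ is doubly transitive. Then, for distinct $i, j, k \in [N]$,
and any $0 \le p \le 1$,

\[ \frac{\partial h_i(\underline{p})}{\partial p_j} \bigg|_{\underline{p} = (p, \ldots, p)} = \frac{\partial h_i(\underline{p})}{\partial p_k} \bigg|_{\underline{p} = (p, \ldots, p)}. \]

Proof: Since $\mathcal{G}$ is doubly transitive, there exists a
permutation $\pi \in \mathcal{G}$ such that $\pi(i) = i$ and $\pi(j) = k$. Suppose
$A \in \partial_j \Omega_i$. Then, by definition, either 1) $A \in \Omega_i$ and
$A \setminus \{j\} \notin \Omega_i$ or 2) $A \cup \{j\} \in \Omega_i$ and $A \notin \Omega_i$. In either
case, we claim that $\pi(A) \in \partial_k \Omega_i$. We prove this for the first
case. The proof for the second case can be obtained verbatim
by replacing $A$ with $A \cup \{j\}$.

Suppose $A \in \Omega_i$ and $A \setminus \{j\} \notin \Omega_i$. Since $\pi \in \mathcal{G}$ and
$\pi(i) = i$, $\pi(A) \in \Omega_i$. Also, $\pi(A \setminus \{j\}) \notin \Omega_i$; otherwise,
$A \setminus \{j\} = \pi^{-1}(\pi(A \setminus \{j\})) \in \Omega_i$ gives a contradiction. Finally,
$\pi(A \setminus \{j\}) = \pi(A) \setminus \{k\}$ implies that $\pi(A) \in \partial_k \Omega_i$. Similarly,
one finds that $A \in \partial_k \Omega_i$ implies $\pi^{-1}(A) \in \partial_j \Omega_i$.

Since Proposition 4b) implies that $\frac{\partial h_i}{\partial p_j} \big|_{\underline{p} = (p, \ldots, p)}$ only
depends on the weights of elements in $\partial_j \Omega_i$ and $|A| = |\pi(A)|$,
we obtain the desired result. ■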


D. Capacity-Achieving Codes 

Definition 9: Suppose $\{\mathcal{C}_n\}$ is a sequence of codes with rates
$\{r_n\}$ where $r_n \to r$ for $r \in (0,1)$.

a) $\{\mathcal{C}_n\}$ is said to be capacity achieving on the BEC under
bit-MAP decoding, if for any $p \in [0, 1 - r)$, the average
bit-erasure probabilities satisfy

\[ \lim_{n \to \infty} P_b^{(n)}(p) = 0. \]

b) $\{\mathcal{C}_n\}$ is said to be capacity achieving on the BEC under
block-MAP decoding, if for any $p \in [0, 1 - r)$, the
block-erasure probabilities satisfy

\[ \lim_{n \to \infty} P_B^{(n)}(p) = 0. \]

Note that in the definition above, we do not impose any
constraints on the blocklength of the code $\mathcal{C}_n$.



Fig. 1. The average EXIT function of the rate-1/2 Reed-Muller code with 
blocklength N. 


The following proposition encapsulates our approach in
showing capacity achievability. It naturally bridges
capacity-achieving codes, average EXIT functions, and the sharp
transition framework presented in the next section that allows us
to show that the transition width³ of certain functions goes to
0. The average EXIT functions of some rate-1/2 Reed-Muller
codes are shown in Figure 1. Observe that as the blocklength
increases, the transition width of the average EXIT function
decreases. According to the following proposition, if this width
converges to 0, then Reed-Muller codes achieve capacity on
the BEC under bit-MAP decoding.


Proposition 10: Let $\{\mathcal{C}_n\}$ be a sequence of codes with rates
$\{r_n\}$ where $r_n \to r$ for $r \in (0,1)$. Then, the following
statements are equivalent.

S1: $\{\mathcal{C}_n\}$ is capacity achieving on the BEC under bit-MAP
decoding.

S2: The sequence of average EXIT functions satisfies

\[ \lim_{n \to \infty} h^{(n)}(p) = \begin{cases} 0 & \text{if } 0 \le p < 1 - r, \\ 1 & \text{if } 1 - r < p \le 1. \end{cases} \]

S3: For any $0 < \varepsilon \le 1/2$,

\[ \lim_{n \to \infty} \left( p^{(n)}_{1 - \varepsilon} - p^{(n)}_{\varepsilon} \right) = 0, \]

where $p^{(n)}_t$ is the functional inverse of $h^{(n)}$ given by (7).
Proof: See Appendix [J 

The equivalence between the first two statements is due
to the close relationship between the bit erasure probability
and the average EXIT function in (3), while the equivalence
between the last two statements is a consequence of the area
theorem in Proposition 4c). ■

While the above result appears deceptively simple, our
approach is successful largely because the transition point
of the limiting EXIT function is known a priori due to the
area theorem. Even though the sharp transition framework
presented in the next section is widely applicable in theoretical
computer science and allows one to deduce that the transition
width of certain functions goes to 0, establishing the existence
of a threshold and determining its precise location if it exists
can be notoriously difficult [68]-[70].

3 Defined as the interval length where the function goes from $\varepsilon$ to $1 - \varepsilon$.


III. Sharp Thresholds for Monotone Boolean Functions via Isoperimetric Inequalities

As seen in Proposition 10, the crucial step in proving the achievability of capacity for a sequence of codes is to show that the average EXIT function transitions sharply from 0 to 1. From the explicit evaluation of $h_i$ in Proposition 4(a), it is clear that the set $\Omega_i$ defines the behavior of $h_i$. Indeed, these sets play a crucial role in our analysis.

In this section, we treat the sets $\Omega_i$ and $\partial_j \Omega_i$ from Definition 2 as sets of sequences in $\{0,1\}^{N-1}$, since index $i$ is not present in any of their elements. This occurs because $h_i(\underline{p})$ is not a function of $p_i$. To make this notion precise, we associate $A \subseteq [N] \setminus \{i\}$ with $\Phi_i(A) \in \{0,1\}^{N-1}$, where bit $\ell$ of $\Phi_i(A)$ is given by

$$[\Phi_i(A)]_\ell = \begin{cases} \mathbb{1}_A(\ell) & \text{if } \ell < i, \\ \mathbb{1}_A(\ell+1) & \text{if } \ell \ge i. \end{cases}$$

Now, define

$$\Omega'_i = \{\Phi_i(A) \in \{0,1\}^{N-1} \mid A \in \Omega_i\}, \qquad (8)$$

$$\partial_j \Omega'_i = \{\Phi_i(A) \in \{0,1\}^{N-1} \mid A \in \partial_j \Omega_i\}.$$

Whenever we treat $\Omega_i$ and $\partial_j \Omega_i$ as sequences of length $N - 1$, we refer to them as $\Omega'_i$ and $\partial_j \Omega'_i$ to avoid confusion.

Consider the space $\{0,1\}^M$ with a measure $\mu_p$ such that

$$\mu_p(\Omega) = \sum_{x \in \Omega} p^{|x|} (1-p)^{M-|x|}, \quad \text{for } \Omega \subseteq \{0,1\}^M,$$

where the weight $|x| = x_1 + \cdots + x_M$ is the number of 1's in $x$. We note that $h_i(p) = \mu_p(\Omega'_i)$ with $M = N - 1$.

Recall that for $x, y \in \{0,1\}^M$, we write $x \le y$ if $x_i \le y_i$ for all $i \in [M]$.

Definition 11: A non-empty proper subset $\Omega \subset \{0,1\}^M$ is called monotone if $x \in \Omega$ and $x \le y$, then $y \in \Omega$.

Remark 12: If the bit-MAP decoder cannot recover bit $i$ from a received sequence, then it cannot recover bit $i$ from any received sequence formed by adding additional erasures to the original received sequence. This implies that the set $\Omega'_i$ is monotone.

Monotone sets appear frequently in the theory of random graphs, satisfiability problems, etc. For a monotone set $\Omega$, $\mu_p(\Omega)$ is a strictly increasing function of $p$. Often, the quantity $\mu_p(\Omega)$ exhibits a threshold type behavior, as a function of $p$, where it jumps quickly from 0 to 1. One technique that has been surprisingly effective in showing this behavior is based on deriving inequalities of the form

$$\frac{d\mu_p(\Omega)}{dp} \ge w\, \mu_p(\Omega)\,(1 - \mu_p(\Omega)). \qquad (9)$$

If $w$ is large, then the derivative of $\mu_p(\Omega)$ will be large when $\mu_p(\Omega)$ is not close to either 0 or 1. In this case, $\mu_p(\Omega)$ must transition from 0 to 1 over a narrow range of $p$ values.

One elegant way to obtain such inequalities is based on discrete isoperimetric inequalities [53], [54]. First, let us define the function $g_\Omega \colon \{0,1\}^M \to \mathbb{N} \cup \{0\}$, which quantifies the boundary between $\Omega$ and $\Omega^c$,

$$g_\Omega(x) = \begin{cases} |\{y \in \Omega^c \mid d_H(x, y) = 1\}| & \text{if } x \in \Omega, \\ 0 & \text{if } x \notin \Omega, \end{cases} \qquad (10)$$
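To make the sets above concrete, the following Python sketch enumerates $\Omega'_i$ for two toy codes and evaluates $\mu_p(\Omega'_i)$. It assumes the standard characterization for linear codes on the BEC (Definition 2 is not restated in this section): bit $i$ cannot be recovered from the other bits exactly when some codeword has a 1 in position $i$ and is zero on every unerased position other than $i$. The helper names (`codewords`, `omega_prime`, `is_monotone`, `mu`) and the zero-based indexing are illustrative choices, not notation from the paper.

```python
from itertools import product


def codewords(gen_rows):
    """All codewords of the binary linear code spanned by the rows of gen_rows."""
    n = len(gen_rows[0])
    words = set()
    for coeffs in product((0, 1), repeat=len(gen_rows)):
        word = tuple(
            sum(c * row[k] for c, row in zip(coeffs, gen_rows)) % 2 for k in range(n)
        )
        words.add(word)
    return words


def omega_prime(code, i):
    """Erasure patterns x in {0,1}^(N-1) on the bits other than i (1 = erased)
    from which bit i cannot be recovered (assumed characterization, see lead-in)."""
    n = len(next(iter(code)))
    others = [k for k in range(n) if k != i]
    bad = set()
    for x in product((0, 1), repeat=n - 1):
        erased = {k for k, e in zip(others, x) if e}
        unerased = [k for k in others if k not in erased]
        # Unrecoverable iff a codeword with c_i = 1 vanishes on all unerased bits.
        if any(c[i] == 1 and all(c[k] == 0 for k in unerased) for c in code):
            bad.add(x)
    return bad


def is_monotone(omega):
    """Upward closure under single 0 -> 1 flips (equivalent to Definition 11)."""
    for x in omega:
        for j, bit in enumerate(x):
            if bit == 0 and x[:j] + (1,) + x[j + 1:] not in omega:
                return False
    return True


def mu(omega, p):
    """The measure mu_p of a set of binary sequences of common length M."""
    return sum(p ** sum(x) * (1 - p) ** (len(x) - sum(x)) for x in omega)


repetition = codewords([(1, 1, 1)])              # length-3 repetition code
parity = codewords([(1, 1, 0), (0, 1, 1)])       # length-3 single parity-check code
for name, code in (("repetition", repetition), ("parity", parity)):
    om = omega_prime(code, 0)
    print(name, sorted(om), is_monotone(om), mu(om, 0.3))
```

For the repetition code the sketch gives $\mu_p(\Omega'_0) = p^2$ and for the single parity-check code $1 - (1-p)^2$, which are the familiar EXIT functions of these two codes.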


where $d_H$ is the Hamming distance. Surprisingly, for a monotone set $\Omega$, the derivative $d\mu_p(\Omega)/dp$ can be characterized exactly by $g_\Omega$, according to the Margulis-Russo Lemma [55],

$$\frac{d\mu_p(\Omega)}{dp} = \frac{1}{p} \int g_\Omega(x)\, \mu_p(dx).$$

Observe that $\mu_p(\Omega) + \mu_p(\Omega^c) = 1$ for any $0 < p < 1$. For a monotone set $\Omega$, as we increase $p$, the probability from $\Omega^c$ flows to $\Omega$. Intuitively, the Margulis-Russo Lemma says that this flow depends only on the boundary between $\Omega$ and $\Omega^c$. To obtain inequalities of type (9), one approach is to find a lower bound on $g_\Omega$ that holds whenever it is non-zero [55], [56].
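As a sanity check of the boundary form of the Margulis-Russo Lemma, the sketch below compares $\frac{1}{p}\sum_{x \in \Omega} g_\Omega(x)\,\mu_p(x)$ with a numerical derivative of $\mu_p(\Omega)$ on a small monotone set. The example set (majority of three bits) and the function names are assumptions made for illustration only.

```python
from itertools import product


def weight(x, p):
    """mu_p of the single point x."""
    return p ** sum(x) * (1 - p) ** (len(x) - sum(x))


def mu(omega, p):
    return sum(weight(x, p) for x in omega)


def g(omega, x):
    """Number of points outside omega at Hamming distance 1 from x (0 if x is not in omega)."""
    if x not in omega:
        return 0
    return sum(
        1 for j in range(len(x)) if x[:j] + (1 - x[j],) + x[j + 1:] not in omega
    )


def russo_boundary_form(omega, p):
    """Right-hand side (1/p) * integral of g_Omega with respect to mu_p."""
    return sum(g(omega, x) * weight(x, p) for x in omega) / p


def numerical_derivative(omega, p, step=1e-6):
    """Central difference approximation of d mu_p(Omega) / dp."""
    return (mu(omega, p + step) - mu(omega, p - step)) / (2 * step)


# Majority of three bits: a monotone set with mu_p = 3p^2 - 2p^3.
majority = {x for x in product((0, 1), repeat=3) if sum(x) >= 2}
print(russo_boundary_form(majority, 0.3), numerical_derivative(majority, 0.3))
```

For this set both quantities equal $6p(1-p)$, since only the three weight-2 points lie on the boundary and each has two neighbors outside the set.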

These techniques were introduced to coding by Tillich and Zemor to analyze the block error rate of linear codes under block-MAP decoding [61], [62]. In that case, the minimum non-zero value of $g_\Omega$ is proportional to the minimum distance of the code. Unfortunately, for the bit-MAP decoding problem we consider, the minimum non-zero $g_\Omega$ may be small (e.g., 1) even when the minimum distance of the code is arbitrarily large. This is discussed further in Section V-A. To circumvent this, we discuss another approach, which requires a different formulation of the Margulis-Russo Lemma. We begin with a few definitions.


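Definition 13: Let $\Omega$ be a monotone set and let

$$\partial_j \Omega = \{x \in \{0,1\}^M \mid \mathbb{1}_\Omega(x) \ne \mathbb{1}_\Omega(x^{(j)})\},$$

where $x^{(j)}$ is defined by $x^{(j)}_\ell = x_\ell$ for $\ell \ne j$ and $x^{(j)}_j = 1 - x_j$. Let the influence of bit $j \in [M]$ be defined by

$$I_j^{(p)}(\Omega) = \mu_p(\partial_j \Omega)$$

and the total influence be defined by

$$I^{(p)}(\Omega) = \sum_{\ell=1}^{M} I_\ell^{(p)}(\Omega).$$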


The Margulis-Russo Lemma can also be stated in terms of 
the total influence. 


Theorem 14 ([53, Theorem 9.15]): Let $\Omega$ be a monotone set. Then,

$$\frac{d\mu_p(\Omega)}{dp} = I^{(p)}(\Omega).$$
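The influence form of the lemma can be checked the same way. The following sketch computes each $I_j^{(p)}(\Omega)$ by brute force for a deliberately asymmetric monotone set, $x_0 \wedge (x_1 \vee x_2)$, and compares the total influence with a numerical derivative. The set and the helper names are illustrative assumptions; the example also shows that influences need not be equal without additional symmetry.

```python
from itertools import product


def weight(x, p):
    return p ** sum(x) * (1 - p) ** (len(x) - sum(x))


def mu(omega, p):
    return sum(weight(x, p) for x in omega)


def influence(omega, j, m, p):
    """mu_p of the set of x in {0,1}^m whose membership in omega changes when bit j is flipped."""
    total = 0.0
    for x in product((0, 1), repeat=m):
        flipped = x[:j] + (1 - x[j],) + x[j + 1:]
        if (x in omega) != (flipped in omega):
            total += weight(x, p)
    return total


def total_influence(omega, m, p):
    return sum(influence(omega, j, m, p) for j in range(m))


# x0 AND (x1 OR x2): monotone, but bit 0 is more influential than bits 1 and 2.
omega = {x for x in product((0, 1), repeat=3) if x[0] and (x[1] or x[2])}
p, step = 0.5, 1e-6
derivative = (mu(omega, p + step) - mu(omega, p - step)) / (2 * step)
print([influence(omega, j, 3, p) for j in range(3)], total_influence(omega, 3, p), derivative)
```

Here $\mu_p(\Omega) = 2p^2 - p^3$, so the derivative is $4p - 3p^2$, which matches $I_0^{(p)} + I_1^{(p)} + I_2^{(p)} = (2p - p^2) + 2p(1-p)$.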

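Remark 15: Note that we have already seen Theorem 14 in the context of EXIT functions. When $M = N - 1$, it is easy to see from Proposition 4 that

$$h_i(p) = \mu_p(\Omega'_i), \qquad I_{j'}^{(p)}(\Omega'_i) = \left. \frac{\partial h_i(\underline{p})}{\partial p_j} \right|_{\underline{p} = (p, \ldots, p)},$$

where

$$j' = \begin{cases} j & \text{if } j < i, \\ j - 1 & \text{if } j > i. \end{cases}$$

Therefore, Theorem 14 is equivalent to

$$\frac{dh_i(p)}{dp} = \sum_{j \in [N] \setminus \{i\}} \left. \frac{\partial h_i(\underline{p})}{\partial p_j} \right|_{\underline{p} = (p, \ldots, p)}, \qquad (11)$$

a straightforward result from vector calculus since $h_i$ does not depend on $p_i$.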


The advantage of using influences over $g_\Omega$ is that with "sufficient symmetry" in $\Omega$, it is possible to show threshold phenomenon without any other knowledge about $\Omega$. The following theorem illustrates the power of symmetry. Our proof hinges on this result.


Theorem 16: Let $\Omega$ be a monotone set and suppose that, for all $0 < p < 1$, the influences of all bits are equal, $I_1^{(p)}(\Omega) = \cdots = I_M^{(p)}(\Omega)$.

a) Then, there exists a universal constant $C \ge 1$, which is independent of $p$, $\Omega$, and $M$, such that

$$\frac{d\mu_p(\Omega)}{dp} \ge C (\log M)\, \mu_p(\Omega)\,(1 - \mu_p(\Omega)),$$

for all $0 < p < 1$.

b) Consequently, for any $0 < \epsilon \le 1/2$,

$$p_{1-\epsilon} - p_\epsilon \le \frac{2 \log \frac{1-\epsilon}{\epsilon}}{C \log M},$$

where $p_t = \inf\{p \in [0,1] \mid \mu_p(\Omega) \ge t\}$ is well-defined because $\mu_p(\Omega)$ is strictly increasing in $p$ with $\mu_0(\Omega) = 0$ and $\mu_1(\Omega) = 1$.


Proof: See (58), El, El- E3 Section 9.6] for details. 

In this form (i.e., by assuming all influences are equal), this result first appeared in [58]. However, this theorem can be
seen as an immediate consequence of the earlier result in El 
Corollary 1.4]. The constant $C$ was later improved in [72]. From the outline in [53, Exercise 9.8], one can verify this theorem for $C = 1$.

For the historical context, the study of influences for boolean functions was initiated in a 1987 technical report that led to [73]. Shortly after, [74] applied harmonic analysis to obtain some powerful general theorems about boolean functions.
These results were subsequently generalized in El . ■ 
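The step from part a) to part b) of Theorem 16 is a one-line integration: a) says that $\frac{d}{dp} \log \frac{\mu_p(\Omega)}{1 - \mu_p(\Omega)} \ge C \log M$, and integrating from $p_\epsilon$ to $p_{1-\epsilon}$ gives $2 \log \frac{1-\epsilon}{\epsilon} \ge C \log M \,(p_{1-\epsilon} - p_\epsilon)$. The short sketch below evaluates the resulting width bound, assuming natural logarithms and $C = 1$ (the function name `width_bound` is hypothetical). It is meant only to show how slowly the guaranteed width shrinks with the blocklength.

```python
import math


def width_bound(M, eps, C=1.0):
    """Upper bound 2*log((1-eps)/eps) / (C*log(M)) on the transition width
    p_{1-eps} - p_eps (natural logarithm and C = 1 are assumptions)."""
    if not 0 < eps <= 0.5:
        raise ValueError("eps must lie in (0, 1/2]")
    return 2 * math.log((1 - eps) / eps) / (C * math.log(M))


# M = N - 1 for blocklengths N = 2^m; the bound exceeds 1 (vacuous) for small m.
for m in (10, 20, 40, 80):
    print(m, width_bound(2 ** m - 1, eps=0.1))
```

The bound decays only like $1/\log M$, which is why additional arguments are needed later for the block erasure probability.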

For the sets $\Omega'_i$, such a symmetry between influences is imposed by the doubly transitive property of the permutation group of the code according to Proposition 8.


Theorem 17: Let $\{\mathcal{C}_n\}$ be a sequence of codes where the blocklengths satisfy $N_n \to \infty$, the rates satisfy $r_n \to r$, and the permutation group $\mathcal{G}^{(n)}$ (of $\mathcal{C}_n$) is doubly transitive for each $n$. If $r \in (0,1)$, then $\{\mathcal{C}_n\}$ is capacity achieving on the BEC under bit-MAP decoding.

Proof: Let the average EXIT function of $\mathcal{C}_n$ be $h^{(n)}$. The quantities $N$, $\mathcal{G}$, $h$, $h_i$, $\Omega'_i$, and $p_t$ that appear in this proof are all indexed by $n$; we drop the index to avoid cluttering. Fix some $i \in [N]$. Since $\mathcal{G}$ is transitive, from Proposition 7,

$$h(p) = h_i(p), \quad \text{for all } p \in [0,1].$$

Consider the sets $\Omega'_i$ from Definition 2 and (8), and let $M = N - 1$. Observe that, from Proposition 4,

$$h_i(p) = \mu_p(\Omega'_i), \qquad I_{j'}^{(p)}(\Omega'_i) = \left. \frac{\partial h_i(\underline{p})}{\partial p_j} \right|_{\underline{p} = (p, \ldots, p)},$$

where $j'$ is as defined in Remark 15. Since $\mathcal{G}$ is doubly transitive, from Proposition 8,

$$I_j^{(p)}(\Omega'_i) = I_k^{(p)}(\Omega'_i) \quad \text{for all } j, k \in [N-1].$$

Using Theorem 16, we have

$$p_{1-\epsilon} - p_\epsilon \le \frac{2 \log \frac{1-\epsilon}{\epsilon}}{C \log(N-1)},$$

where $p_t$ is the functional inverse of $h$ from (7). Since $N \to \infty$ from the hypothesis,

$$\lim_{n \to \infty} (p_{1-\epsilon} - p_\epsilon) = 0.$$

Therefore, from Proposition 10, $\{\mathcal{C}_n\}$ is capacity achieving on the BEC under bit-MAP decoding. ■

We now focus on the block erasure probability. Recall from 
(Q] and ([2l that the block erasure probability satisfies the upper 
bounds 

$$P_b \le P_B \le \frac{N}{d_{\min}} P_b.$$

Thus, if $P_b \to 0$ with sufficient speed, then $P_B \to 0$ as well. Using (9), one can derive the upper bound (see Lemma 29 in Appendix II for a proof)

$$\mu_p(\Omega) \le \exp(-w[p_{1/2} - p]),$$

where $p_{1/2} \in [0,1]$ is defined uniquely by $\mu_{p_{1/2}}(\Omega) = 1/2$.

For $p < 1 - r$, the factor of $\log(N-1)$ in Theorem 17 determines the decay rate of $h$ with $N$, and consequently the decay rate of $P_b$. The following theorem shows that, if $d_{\min}$ satisfies $\log(d_{\min})/\log(N) \to 1$, then this decay rate is also sufficient to show that $P_B \to 0$.


Theorem 18: Let $\{\mathcal{C}_n\}$ be a sequence of codes where the blocklengths satisfy $N_n \to \infty$ and the rates satisfy $r_n \to r$ for $r \in (0,1)$. Suppose that the average EXIT function of $\mathcal{C}_n$ also satisfies, for $0 < p < 1$,

$$\frac{dh^{(n)}(p)}{dp} \ge C \log(N_n)\, h^{(n)}(p)\,(1 - h^{(n)}(p)),$$

where $C > 0$ is a constant independent of $p$ and $n$. If the minimum distances satisfy

$$\lim_{n \to \infty} \frac{\log d_{\min}^{(n)}}{\log N_n} = 1,$$

then $\{\mathcal{C}_n\}$ is capacity achieving on the BEC under block-MAP decoding.

Proof: See Appendix III-A. ■

If $d_{\min}$ does not grow rapidly enough (e.g., sequences of Reed-Muller codes with rates $r_n \to r \in (0,1)$ have $d_{\min} = O(N^{1/2+\delta})$ for any $\delta > 0$), then the previous theorem does not apply. Fortunately, it is possible to exploit symmetries, beyond the double transitivity of the permutation group, to obtain inequalities like (9) that grow asymptotically faster than $\log(N)$ [73]. In particular, one obtains inequalities of type (9), with factors of higher order than $\log(N)$, for all $p$ except a neighborhood around 0 and 1 that vanishes as $N \to \infty$. The following theorem shows that this is sufficient to show that $P_B \to 0$ without imposing requirements on $d_{\min}$.

Theorem 19: Let $\{\mathcal{C}_n\}$ be a sequence of codes where the blocklengths satisfy $N_n \to \infty$ and the rates satisfy $r_n \to r$ for $r \in (0,1)$. Suppose that the average EXIT function of $\mathcal{C}_n$ also satisfies, for $a_n < p < b_n$,

$$\frac{dh^{(n)}(p)}{dp} \ge w_n \log(N_n)\, h^{(n)}(p)\,(1 - h^{(n)}(p)),$$

where $w_n \to \infty$, $a_n \to 0$, $b_n \to 1$ and $0 < a_n < b_n < 1$. Then, $\{\mathcal{C}_n\}$ is capacity achieving on the BEC under block-MAP decoding.

Proof: See Appendix III-B. ■

IV. Applications 
A. Affine-Invariant Codes 

Consider a code $\mathcal{C}$ of length $N = 2^m$ and the Galois field $\mathbb{F}_N$. Let $\Theta \colon [N] \to \mathbb{F}_N$ denote a bijection between the elements of the field and the code bits. Take a pair $\beta, \gamma \in \mathbb{F}_N$ with $\beta \ne 0$ and define $\pi_{\beta,\gamma} \in S_N$ such that

$$\pi_{\beta,\gamma}(\ell) = \Theta^{-1}(\beta\, \Theta(\ell) + \gamma).$$

Note that $\pi_{\beta,\gamma}$ is well-defined since $\Theta$ is bijective and $\beta \ne 0$, and observe that $\pi_{\beta_1,\gamma_1} \circ \pi_{\beta_2,\gamma_2} = \pi_{\beta_1 \beta_2,\, \beta_1 \gamma_2 + \gamma_1}$. As such, the collection of permutations $\pi_{\beta,\gamma}$ forms a group. Now, the code $\mathcal{C}$ is called affine-invariant if its permutation group contains the subgroup

$$\{\pi_{\beta,\gamma} \in S_N \mid \beta, \gamma \in \mathbb{F}_N,\ \beta \ne 0\},$$

for some bijection $\Theta$ [66, Section 4.7].

Affine-invariant codes are of interest to us because their permutation groups are doubly transitive. To see this, consider distinct $i, j, k \in [N]$ and choose $\beta, \gamma \in \mathbb{F}_N$ where

$$\beta = \frac{\Theta(i) - \Theta(k)}{\Theta(i) - \Theta(j)}, \qquad \gamma = \Theta(i) \left( \frac{\Theta(k) - \Theta(j)}{\Theta(i) - \Theta(j)} \right),$$

and observe that $\pi_{\beta,\gamma}(i) = i$ and $\pi_{\beta,\gamma}(j) = k$.

Thus, by Theorem 17, a sequence of affine-invariant codes of increasing length, with rates converging to $r \in (0,1)$, achieves capacity on the BEC under bit-MAP decoding. Some examples of great interest include generalized Reed-Muller codes [12, Corollary 2.5.3] and extended primitive narrow-sense BCH codes [66, Theorem 5.1.9]. Below, we discuss Reed-Muller and BCH codes in more detail.

B. Reed-Muller Codes

For integers $v, m$ satisfying $0 \le v \le m$, the Reed-Muller code RM$(v, m)$ is a binary linear code with length $N = 2^m$ and rate $r = 2^{-m} \left( \binom{m}{0} + \cdots + \binom{m}{v} \right)$. Although it is possible to describe these codes from the perspective of affine-invariance [12, Corollary 2.5.3], below, we treat them as polynomial codes [76]. This provides a far more powerful insight to their structure [12], [77].

Consider the set of $m$ variables, $x_1, \ldots, x_m$. For a monomial $x_1^{i_1} \cdots x_m^{i_m}$ in these variables, define its degree to be $i_1 + \cdots + i_m$. A polynomial in $m$ variables is the linear combination (using coefficients from a field) of such monomials and the degree of a polynomial is defined to be the maximum degree of any monomial it contains. It is well-known that the set of all $m$-variable polynomials of degree at most $v$ is a vector space over its field of coefficients. In this section, the coefficient field is the Galois field $\mathbb{F}_2$ and the vector space of interest is given by

$$P(m, v) = \mathrm{span}\{x_1^{t_1} \cdots x_m^{t_m} \mid t_1 + \cdots + t_m \le v,\ t_i \in \{0,1\}\}.$$

For a polynomial $f \in P(m, v)$, $f(x) \in \{0,1\}$ denotes the evaluation of $f$ at $x \in \{0,1\}^m$.

Let the elements of the vector space $\{0,1\}^m$ over $\mathbb{F}_2$ be enumerated by $e_1, e_2, \ldots, e_N$ with $e_N = 0^m$. For any polynomial $f \in P(m, v)$, we can evaluate $f$ at $e_i$ for all $i \in [N]$. Then, the code RM$(v, m)$ is defined to be the set

$$\mathrm{RM}(v, m) = \{(f(e_1), \ldots, f(e_N)) \mid f \in P(m, v)\}.$$

Lemma 20 ([13, Corollary 4]): The permutation group $\mathcal{G}$ of RM$(v, m)$ is doubly transitive.

Proof: Take any distinct $i, j, k \in [N]$. Below, we will produce a $\pi \in \mathcal{G}$ such that $\pi(i) = i$ and $\pi(j) = k$.

It is well known that for any vector space with two ordered bases $(u_1, \ldots, u_m)$ and $(u'_1, \ldots, u'_m)$, there exists an invertible $m \times m$ matrix $T$ such that

$$u_i = T u'_i, \quad \text{for all } i \in [m].$$

Note that since $i, j, k$ are distinct, $e_j - e_i \ne 0^m$ and $e_k - e_i \ne 0^m$. Therefore, there exists an invertible $m \times m$ binary matrix $T$ such that

$$T(e_j - e_i) = e_k - e_i.$$

For such a $T$, we construct $\pi \colon [N] \to [N]$ by defining $\pi(\ell) = \ell'$ for the unique $\ell'$ such that

$$e_{\ell'} = T(e_\ell - e_i) + e_i.$$

Note that $\pi \in S_N$ since $T$ is invertible. Also, by construction, $\pi(i) = i$ and $\pi(j) = k$.

It remains to show that $\pi \in \mathcal{G}$. For this, consider a codeword in RM$(v, m)$ given by $f \in P(m, v)$. It suffices to produce a $g \in P(m, v)$ such that $g(e_{\pi(\ell)}) = f(e_\ell)$ for all $\ell \in [N]$. Let

$$g(x_1, \ldots, x_m) = f(T^{-1}[x_1, \ldots, x_m]^T - T^{-1} e_i + e_i),$$

and note that degree$(f)$ = degree$(g)$ and $g(e_{\pi(\ell)}) = f(e_\ell)$. Thus, we have the desired $g \in P(m, v)$. Hence, $\mathcal{G}$ is doubly transitive. ■
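The polynomial construction of RM$(v, m)$ and the permutation used in the proof of Lemma 20 can be reproduced by brute force for small parameters. The sketch below is an illustration under stated assumptions: evaluation points are listed in lexicographic order (so $0^m$ comes first rather than last, which only relabels positions), indices are zero-based, and the matrix $T$ is found by exhaustive search, which is practical only for tiny $m$. All function names are hypothetical.

```python
from itertools import combinations, product
from math import comb


def rm_code(v, m):
    """Return (points, codewords) of RM(v, m), built by evaluating all
    multilinear monomials of degree <= v on {0,1}^m."""
    points = list(product((0, 1), repeat=m))
    monomials = [s for d in range(v + 1) for s in combinations(range(m), d)]
    rows = [tuple(int(all(x[k] for k in s)) for x in points) for s in monomials]
    words = {tuple([0] * len(points))}
    for row in rows:  # span over F_2, built one generator at a time
        words |= {tuple(a ^ b for a, b in zip(w, row)) for w in words}
    return points, words


def xor(a, b):
    """Vector addition (and subtraction) over F_2."""
    return tuple(s ^ t for s, t in zip(a, b))


def mat_vec(T, x):
    return tuple(sum(t * a for t, a in zip(row, x)) % 2 for row in T)


def lemma20_permutation(points, i, j, k):
    """Find an invertible T with T(e_j - e_i) = e_k - e_i and return the
    permutation l -> l' defined by e_l' = T(e_l - e_i) + e_i."""
    m = len(points[0])
    index = {pt: n for n, pt in enumerate(points)}
    for flat in product((0, 1), repeat=m * m):
        T = [flat[r * m:(r + 1) * m] for r in range(m)]
        invertible = len({mat_vec(T, x) for x in points}) == len(points)
        if invertible and mat_vec(T, xor(points[j], points[i])) == xor(points[k], points[i]):
            return [index[xor(mat_vec(T, xor(pt, points[i])), points[i])] for pt in points]
    raise ValueError("i, j, k must be distinct")


def preserves_code(perm, words):
    """True if moving coordinate l to position perm[l] maps every codeword to a codeword."""
    for c in words:
        moved = [0] * len(c)
        for l, target in enumerate(perm):
            moved[target] = c[l]
        if tuple(moved) not in words:
            return False
    return True


points, words = rm_code(1, 3)
perm = lemma20_permutation(points, 0, 1, 2)
print(len(words), perm, preserves_code(perm, words))
```

The number of codewords equals $2^{\binom{m}{0} + \cdots + \binom{m}{v}}$, in agreement with the rate formula, and the returned permutation fixes position `i`, sends `j` to `k`, and preserves the code.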

There is also a sequence of $\{\mathrm{RM}(v_m, m)\}$ codes with increasing blocklengths and rates approaching any $r \in (0,1)$. To construct such a sequence, fix $r \in (0,1)$ and let $\{Z_i\}$ be an iid sequence of Bernoulli(1/2) random variables. Then, the rate of the RM$(v_m, m)$ code is

$$r_m = 2^{-m} \sum_{i=0}^{v_m} \binom{m}{i} = \Pr(Z_1 + \cdots + Z_m \le v_m) = \Pr\left( \frac{Z_1 - \frac{1}{2} + \cdots + Z_m - \frac{1}{2}}{\sqrt{m}/2} \le \frac{v_m - \frac{m}{2}}{\sqrt{m}/2} \right).$$

Thus, by the central limit theorem, if we choose

$$v_m = \max\left\{ \left\lfloor \frac{m}{2} + \frac{\sqrt{m}}{2} Q^{-1}(1 - r) \right\rfloor, 0 \right\},$$

then the rate of RM$(v_m, m)$ satisfies $r_m \to r$ as $m \to \infty$. Here,

$$Q(t) = \frac{1}{\sqrt{2\pi}} \int_t^\infty e^{-\tau^2/2}\, d\tau.$$

Theorem 21: For any $r \in (0,1)$, the sequence of codes $\{\mathrm{RM}(v_m, m)\}$ with

$$v_m = \max\left\{ \left\lfloor \frac{m}{2} + \frac{\sqrt{m}}{2} Q^{-1}(1 - r) \right\rfloor, 0 \right\}$$

has rate $r_m \to r$ and is capacity achieving on the BEC under bit-MAP decoding.

Proof: This result follows as an immediate consequence of Lemma 20 and Theorem 17. ■

We now analyze the block erasure probability of Reed-Muller codes. The minimum distance of Reed-Muller codes is too small to utilize Theorem 18. Thus, we use Theorem 19 instead.

For the code RM$(v, m)$, consider the set $\Omega'_N$ from Definition 2 and (8). Let $\mathcal{G}_N$ be the permutation group of $\Omega'_N$ defined by

$$\mathcal{G}_N = \{\pi \in S_{N-1} \mid \pi(a) \in \Omega'_N \text{ for all } a \in \Omega'_N\}.$$

Lemma 22: For the permutation group $\mathcal{G}_N$ defined above, there is a transitive subgroup isomorphic to GL$(m, \mathbb{F}_2)$, the general linear group of degree $m$ over the Galois field $\mathbb{F}_2$.

Proof: For a given $T \in \mathrm{GL}(m, \mathbb{F}_2)$, associate $\pi_T \in S_{N-1}$, where

$$\pi_T(\ell) = \ell', \quad \text{where } e_{\ell'} = T e_\ell.$$

Note that $\pi_T$ is well-defined since $T$ is invertible. Moreover, it is easy to check that $\pi_{T_1} \circ \pi_{T_2} = \pi_{T_1 T_2}$ for $T_1, T_2 \in \mathrm{GL}(m, \mathbb{F}_2)$. As such, the collection of permutations

$$\mathcal{H} = \{\pi_T \in S_{N-1} \mid T \in \mathrm{GL}(m, \mathbb{F}_2)\}$$

is a subgroup of $S_{N-1}$ isomorphic to GL$(m, \mathbb{F}_2)$. Also, for $i, j \in [N-1]$, there exists $T \in \mathrm{GL}(m, \mathbb{F}_2)$ such that $e_j = T e_i$. For such a $T$, $\pi_T(i) = j$. Therefore, $\mathcal{H}$ is transitive.

It remains to show that $\mathcal{H} \subseteq \mathcal{G}_N$. For this, associate $\pi_T \in \mathcal{H}$ with $\pi'_T \in S_N$ where

$$\pi'_T(\ell) = \pi_T(\ell) \text{ for } \ell \in [N-1], \qquad \pi'_T(N) = N.$$

Also, it is easy to show that $\pi_T \in \mathcal{G}_N$ if $\pi'_T \in \mathcal{G}$, the permutation group of RM$(v, m)$. To see that $\pi'_T \in \mathcal{G}$, consider a codeword given by $f \in P(m, v)$. It suffices to produce a $g \in P(m, v)$ where $g(e_{\pi'_T(\ell)}) = f(e_\ell)$ for $\ell \in [N]$. The desired $g$ is given by $g(x_1, \ldots, x_m) = f(T^{-1}[x_1, \ldots, x_m]^T)$, by observing that degree$(g)$ = degree$(f)$ and $g(e_N) = f(T^{-1} 0^m) = f(e_N)$. ■

Theorem 23: For any $r \in (0,1)$, the sequence of codes $\{\mathrm{RM}(v_m, m)\}$, with

$$v_m = \max\left\{ \left\lfloor \frac{m}{2} + \frac{\sqrt{m}}{2} Q^{-1}(1 - r) \right\rfloor, 0 \right\},$$

has rate $r_m \to r$ and is capacity achieving on the BEC under block-MAP decoding.

Proof: Let the EXIT function associated with the last bit and the average EXIT function of the code RM$(v_m, m)$ be $h_N$ and $h$, respectively. Since the permutation group of RM$(v_m, m)$ is transitive by Lemma 20, from Proposition 7, $h = h_N$. Moreover, by Lemma 22, $\mathcal{G}_N$ contains a transitive subgroup isomorphic to GL$(m, \mathbb{F}_2)$.

Now, we can exploit the GL$(m, \mathbb{F}_2)$ symmetry of $\Omega'_N$ within the framework of [75]. In particular, [75, Theorem 1, Corollary 4.1] implies that there exists a universal constant $C > 0$, independent of $m$ and $p$, such that

$$\frac{dh_N(p)}{dp} \ge C \log(\log N_m) \log(N_m)\, h_N(p)\,(1 - h_N(p)),$$

for $0 < a_m < p < b_m < 1$, where $N_m = 2^m$ and $a_m \to 0$, $b_m \to 1$ as $m \to \infty$. Since $h = h_N$, Theorem 19 implies that $\{\mathrm{RM}(v_m, m)\}$ is capacity achieving on the BEC under block-MAP decoding. ■
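The order sequence $v_m$ used in Theorems 21 and 23 is straightforward to evaluate. The sketch below computes $v_m$ and the exact rate $2^{-m} \sum_{i \le v_m} \binom{m}{i}$, using the identity $Q^{-1}(1 - r) = \Phi^{-1}(r)$, where $\Phi$ is the standard normal CDF. The function names `rm_order` and `rm_rate` are hypothetical, and the code is only a numerical illustration of the central limit argument.

```python
from math import comb, floor, sqrt
from statistics import NormalDist


def rm_order(m, r):
    """v_m = max(floor(m/2 + (sqrt(m)/2) * Qinv(1 - r)), 0).
    Qinv(1 - r) equals the standard normal quantile at r."""
    q_inv = NormalDist().inv_cdf(r)
    return max(floor(m / 2 + sqrt(m) / 2 * q_inv), 0)


def rm_rate(v, m):
    """Exact rate of RM(v, m): 2^-m times the sum of binom(m, i) for i <= v."""
    return sum(comb(m, i) for i in range(v + 1)) / 2 ** m


target = 0.3
for m in (10, 100, 1000, 10000):
    v = rm_order(m, target)
    print(m, v, rm_rate(v, m))
```

Because $v_m$ must be an integer, the rate approaches the target only at speed roughly $1/\sqrt{m}$.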

From this, we see that the block erasure probability goes to 0 for $p < 1 - r$. For $p > 1 - r$, the average EXIT function $h(p)$ is bounded away from 0. Thus, Theorem 21 implies that the bit erasure probability $p\,h(p)$ is bounded away from 0 but not converging to 1. The block erasure probability does converge to 1, however. This follows from the result in [62] because the minimum distance of the code RM$(v_m, m)$ goes to $\infty$ as $m \to \infty$.

C. Bose-Chaudhuri-Hocquenghem Codes

Let $\alpha$ be a primitive element of $\mathbb{F}_{2^m}$. Recall that a binary BCH code is primitive if its blocklength is of the form $2^m - 1$, and narrow-sense if the roots of its generator polynomial include consecutive powers of a primitive element starting from $\alpha$. In this article, we consider only primitive narrow-sense BCH codes and we follow closely the treatment of BCH codes in [66].

For integers $v, m$ with $1 \le v \le 2^m - 1$, let $f(m, v)$ be the polynomial of lowest degree over $\mathbb{F}_2$ that has the roots

$$\alpha, \alpha^2, \ldots, \alpha^v.$$

Then, BCH$(v, m)$ is defined to be the binary cyclic code with the generator polynomial $f(m, v)$. This is precisely the primitive narrow-sense BCH code with length $2^m - 1$ and designed distance $v + 1$.

The dimension $K$ of the cyclic code is determined by the degree of the generator polynomial according to [66, Theorem 4.2.1]

$$K = N - \mathrm{degree}(f(m, v)).$$

Moreover, the minimum distance $d_{\min}$ of BCH$(v, m)$ is at least $v + 1$ [66, Theorem 5.1.1].

Since $\mathbb{F}_{2^m}$ is the splitting field of the polynomial $x^N - 1$ [66, Theorem 3.3.2], it is easy to see that degree$(f(m, 2^m - 1)) = N$. Also, since the size of the cyclotomic coset of any element $\alpha^i$ is at most $m$ [66, Section 3.7], we have degree$(f(m, 1)) \le m$ and

$$0 \le \mathrm{degree}(f(m, v+1)) - \mathrm{degree}(f(m, v)) \le m.$$

Thus, for any $r \in (0,1)$, one can choose $v_m$ such that

$$N(1 - r) \le \mathrm{degree}(f(m, v_m)) \le N(1 - r) + m.$$

Now, it is easy to see that $v_m \ge N(1 - r)/m$ and the rate of the code BCH$(v_m, m)$ will be in $[r - m/N,\ r]$.
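The degree of $f(m, v)$, and hence the dimension of BCH$(v, m)$, can be computed without any field arithmetic. The sketch below relies on the standard fact (assumed here, not derived in the text) that the conjugates of $\alpha^j$ over $\mathbb{F}_2$ are $\alpha^{j 2^s}$, so that degree$(f(m, v))$ equals the size of the union of the 2-cyclotomic cosets of $1, \ldots, v$ modulo $2^m - 1$. The function names are hypothetical.

```python
def cyclotomic_coset(j, m):
    """2-cyclotomic coset of j modulo 2^m - 1: {j, 2j, 4j, ...}."""
    n = 2 ** m - 1
    coset, x = set(), j % n
    while x not in coset:
        coset.add(x)
        x = (2 * x) % n
    return coset


def generator_degree(m, v):
    """degree(f(m, v)): number of distinct exponents covered by the cosets of 1..v."""
    exponents = set()
    for j in range(1, v + 1):
        exponents |= cyclotomic_coset(j, m)
    return len(exponents)


def bch_dimension(m, v):
    """K = N - degree(f(m, v)) with N = 2^m - 1."""
    return (2 ** m - 1) - generator_degree(m, v)


for v in (2, 4, 6):
    print(v, generator_degree(4, v), bch_dimension(4, v))
```

For $m = 4$ this reproduces the familiar length-15 BCH codes of dimension 11, 7 and 5, and each step from $v$ to $v + 1$ raises the degree by at most $m$, as used in the argument above.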

Now, consider the length-$2^m$ extended BCH code, eBCH$(v, m)$, which is formed by adding a single parity bit to the code BCH$(v, m)$ so that overall codeword parity is always even [66, Section 5.1]. The code eBCH$(v, m)$ has the same dimension as BCH$(v, m)$ and a minimum distance of at least $v + 1$.

Thus, for any $r \in (0,1)$, there exists a sequence of codes $\{\mathrm{eBCH}(v_m, m)\}$ with blocklengths $N_m = 2^m$, rates $r_m \to r$ and minimum distances

$$d_{\min}^{(m)} \ge 1 + v_m \ge 1 + \frac{(N_m - 1)(1 - r)}{m}. \qquad (12)$$

An important property of the extended BCH codes is that they are affine-invariant [66, Theorem 5.1.9]. Thus, Section IV-A shows that their permutation group is doubly transitive. Therefore, we have the following theorem.

Theorem 24: For any $r \in (0,1)$, there is a sequence $\{v_m\}$ such that the code sequence $\{\mathrm{eBCH}(v_m, m)\}$ has $r_m \to r$ and is capacity achieving on the BEC under bit-MAP decoding.
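The double-transitivity argument for affine permutations from Section IV-A can be checked exhaustively on a small field. The sketch below works in $\mathbb{F}_8$ with elements represented as integers 0 to 7, reduction polynomial $x^3 + x + 1$, and the identity map as the bijection between positions and field elements; all of these are illustrative assumptions, as are the function names. It verifies the group-theoretic claim only and does not test whether any particular code is affine-invariant.

```python
def gf_mul(a, b, m=3, poly=0b1011):
    """Multiply a and b in GF(2^m); poly encodes the reduction polynomial (x^3 + x + 1 by default)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> m:        # degree reached m, so reduce
            a ^= poly
    return result


def gf_inv(a, m=3, poly=0b1011):
    """Multiplicative inverse by exhaustive search (fine for tiny fields)."""
    return next(x for x in range(1, 2 ** m) if gf_mul(a, x, m, poly) == 1)


def affine_pair(i, j, k):
    """beta and gamma chosen so that the affine map fixes i and sends j to k.
    Subtraction in characteristic 2 is XOR."""
    inv = gf_inv(i ^ j)
    beta = gf_mul(i ^ k, inv)
    gamma = gf_mul(i, gf_mul(k ^ j, inv))
    return beta, gamma


def pi(beta, gamma, l):
    """The permutation l -> beta * l + gamma over GF(8)."""
    return gf_mul(beta, l) ^ gamma


beta, gamma = affine_pair(1, 2, 5)
print(beta, gamma, [pi(beta, gamma, l) for l in range(8)])
```

Looping over all distinct triples `(i, j, k)` confirms that the chosen `beta` is always non-zero and that the resulting map is a permutation with the two required images.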


In the following, we discuss the block erasure probability of BCH codes. It is possible to characterize the permutation group of the code eBCH$(v, m)$ precisely. According to [67],
ED, except in sporadic cases, the permutation group of the 
code eBCH$(v, m)$ is equal to the affine semi-linear group. Unfortunately, in the framework of [75], this group does not produce any factors beyond order $\log(N)$. This is not encouraging for the analysis of block erasure probability. This is in contrast with Reed-Muller codes where it was possible to exploit GL$(m, \mathbb{F}_2)$ symmetry to analyze their block erasure probability. It is worth noting that the only cyclic primitive codes, whose permutation group includes the general linear group of degree $m$, are variants of generalized Reed-Muller codes [77].

For BCH codes, however, the minimum distance is large enough to use Theorem 18. In fact, the minimum distance of the code eBCH$(v_m, m)$ from (12) satisfies

$$\lim_{m \to \infty} \frac{\log d_{\min}^{(m)}}{\log N_m} = 1. \qquad (13)$$


Since the permutation group of the code eBCH$(v_m, m)$ is doubly transitive from affine-invariance, by Theorem 16 and the proof of Theorem 17, its average EXIT function satisfies the hypothesis of Theorem 18. Combining this observation with (13) gives the following result.

Theorem 25: For any $r \in (0,1)$, there is a sequence $\{v_m\}$ such that the code sequence $\{\mathrm{eBCH}(v_m, m)\}$ has $r_m \to r$ and is capacity achieving on the BEC under block-MAP decoding.

Corollary 26: For any $r \in (0,1)$, there is a sequence $\{v_m\}$ such that the code sequence $\{\mathrm{BCH}(v_m, m)\}$ has $r_m \to r$ and is capacity achieving on the BEC under both bit-MAP and block-MAP decoding.

Proof: The code BCH$(v, m)$ can be constructed from the code eBCH$(v, m)$ simply by puncturing (i.e., erasing) the overall parity bit. This implies that their EXIT functions satisfy $h^{\mathrm{eBCH}}(p) \ge p\, h^{\mathrm{BCH}}(p)$. From this, we see that

$$h^{\mathrm{BCH}}(p) \le \frac{1}{p}\, h^{\mathrm{eBCH}}(p).$$

Since puncturing a single bit has an asymptotically negligible effect on the rate, the statement of the corollary follows directly from Theorems 24 and 25. ■

Remark 27: Corollary 26 shows that there are sequences of binary cyclic codes that achieve capacity on the BEC. As far as the authors know, this is the first proof that such a sequence exists [79].


V. Discussion 

A. Comparison with the Work of Tillich and Zemor 

Our initial attempts to prove a sharp threshold for EXIT functions focused on analyzing (10) with $\Omega = \Omega_i$. In particular, our aim was to generalize [62] to EXIT functions by finding a lower bound on $g_{\Omega_i}(x)$ that holds uniformly over the boundary

$$\partial\Omega_i = \{x \in \{0,1\}^N \mid g_{\Omega_i}(x) > 0\}.$$

For code sequences where $d_{\min} \to \infty$ and the minimum distance of the dual code satisfies $d_{\min}^{\perp} \to \infty$, we expected that $\min_{x \in \partial\Omega_i} g_{\Omega_i}(x)$ would grow without bound and, thus, that the EXIT function would have a sharp threshold. Unfortunately, this is not true. In fact, the ensemble of $(j, k)$-regular LDPC codes provides a counterexample. With high probability, their minimum distance grows linearly with $N$ but one iteration of iterative decoding shows that the EXIT function is upper bounded by $(1 - (1 - p)^{k-1})^j$ for all $p$ and

n ED. 

To understand this, first recall that a weight-$d$ codeword in the dual code defines a subset of $d$ code bits that sum to 0. If only one of the bits in this dual codeword is erased, then that bit can be recovered indirectly from the other bits. To see this in terms of the boundary, consider the indirect recovery of bit-$i$ and assume that it is contained in a weight-$d$ dual codeword with $d = d_{\min}^{\perp} \ge 3$. Let $x$ be an erasure pattern where $d - 2$ of the $d - 1$ other bits in the dual codeword are received correctly and all other bits are erased. Then, $x \in \Omega_i$ and bit-$i$ cannot be recovered indirectly. Also, bit-$i$ can be recovered indirectly if the erased bit (say bit $j$) in the dual codeword is revealed. Thus, $x^{(j)} \notin \Omega_i$.

Now, let us consider $g_{\Omega_i}(x)$. If there is any other bit (say bit $k$) for which $x^{(k)} \notin \Omega_i$, then the pattern of correctly received symbols in $x^{(k)}$ (along with bit $i$) must cover a dual codeword. Since $x^{(k)}$ contains exactly $d - 1$ zero (i.e., unerased) symbols and the minimum dual distance is $d$, it follows that $x^{(k)}$ must be a dual codeword. Due to linearity, one can add the two vectors to get $x^{(j)} + x^{(k)}$, which clearly has weight 2. However, this contradicts the assumption that the minimum dual distance is $d_{\min}^{\perp} \ge 3$. Thus, we find that only bit $j$ is pivotal for $x$ and

$$\min_{x \in \partial\Omega_i} g_{\Omega_i}(x) = 1.$$

This shows that the method of [62] does not extend automatically to prove sharp thresholds for EXIT functions. While it is possible that there is a simple modification that overcomes this issue, we did not find it.

B. Conditions of Theorem 17

One natural question is whether or not the conditions of Theorem 17 can be weakened. If the permutation groups of the codes in the sequence are not transitive, then different bits may have different EXIT functions with phase transitions at different values of $p$ (e.g., if some of the bits are protected by a random code of one rate and other bits with a random code of a different rate).

Even if the permutation groups are transitive, things can still go wrong. Consider any sequence of codes with transitive permutation groups and increasing length. Let $\{d_{\min}^{(n)}\}$ be the sequence of minimum distances. Then, symmetry implies that the erasure rate of bit-MAP decoding is lower bounded by $p^{d_{\min}^{(n)}}$ for a BEC$(p)$ (e.g., every code bit is covered by a codeword with weight $d_{\min}$). Thus, the sequence does not achieve capacity if $d_{\min}^{(n)}$ has a uniform upper bound. Based on duality, a similar argument holds if the sequence of minimum dual distances $\{d_{\min}^{\perp(n)}\}$ is upper bounded. Thus, to achieve capacity, a necessary condition is that $d_{\min}^{(n)} \to \infty$ and $d_{\min}^{\perp(n)} \to \infty$. Based on this observation, we make the following optimistic conjecture.

Conjecture 28: Let $\{\mathcal{C}_n\}$ be a sequence of binary linear codes where the blocklengths satisfy $N_n \to \infty$, the rates satisfy $r_n \to r$ for $r \in (0,1)$, and the permutation group of each code is transitive. If the sequence of minimum distances satisfies $d_{\min}^{(n)} \to \infty$ and the sequence of minimum dual distances satisfies $d_{\min}^{\perp(n)} \to \infty$, then the sequence achieves capacity on the BEC under bit-MAP decoding.

We call a code reducible if it can be written as the direct 
product of irreducible component codes of shorter length. 
If a code is reducible, then the minimum distance of each 
irreducible component is at least as large as the minimum 
distance of the overall code. Likewise, if the permutation group of a reducible code is transitive, then the permutation group of each irreducible component code must also be transitive.
Moreover, transitivity implies that the EXIT function of each 
bit must equal both the EXIT function of the overall code and 
the EXIT function of any irreducible component code. Thus, 
the rate of the overall code and the rate of each irreducible 


component code must all be equal to the integral of their 
common EXIT function. This implies that, if the overall 
code satisfies the necessary conditions of the conjecture, then 
each of its irreducible component codes must also satisfy 
the necessary conditions. Thus, it is sufficient to resolve the 
conjecture for the case where there is a single irreducible 
component code. 

C. Beyond the Erasure Channel 

Beyond the erasure channel, this work also has implications for the decoding of Reed-Muller codes transmitted over the binary symmetric channel. In particular, the results of [25, Theorem 1.8] show that an error pattern can be corrected by RM$(m - (2t + 2), m)$ whenever an erasure pattern with the same support can be corrected by RM$(m - (t + 1), m)$. Such error patterns can even be corrected efficiently [35].

Another interesting open question is whether or not one can extend this approach to binary-input memoryless symmetric channels via generalized EXIT (GEXIT) functions [80]. For this, some new ideas will certainly be required because the straightforward approach leads to the analysis of functions that are neither boolean nor monotonic.

It would also be very interesting to find boolean functions 
outside of coding theory where area theorems can be used to 
pinpoint sharp thresholds. 

D. $\mathbb{F}_q$-Linear Codes over the q-ary Erasure Channel

While our exposition focuses on binary linear codes over the BEC, it is easy to extend all results to $\mathbb{F}_q$-linear codes over the $q$-ary erasure channel.

First, the set $\Omega_i$ is redefined to be the set of erasure patterns that prevent indirect recovery of the symbol $X_i$. Importantly, $\Omega_i$ is still a set of binary sequences (equivalently, a set of subsets of $[N] \setminus \{i\}$), and not a set of sequences over the alphabet $\{0, 1, \ldots, q - 1\}$. Note that, if indirect recovery is not possible, then the linearity of the code implies that the posterior marginal of symbol $i$ given the extrinsic observations is $\Pr(X_i = x \mid Y_{\sim i} = y_{\sim i}) = 1/q$. Next, we rescale the logarithm in the entropy $H(\cdot)$ to base $q$ so that $H(X_i \mid Y_{\sim i} = y_{\sim i}) = 1$ when indirect recovery of $X_i$ is not possible.

Thus, the sharp threshold framework for monotone boolean functions can be applied without change. With these straightforward modifications, the results in Sections II and III hold true verbatim.

The concept of affine-invariance also extends naturally to $\mathbb{F}_q$-linear codes of length $q^m$ over the Galois field $\mathbb{F}_q$. Similarly, affine-invariance implies that the permutation group is doubly transitive. Thus, sequences of affine-invariant $\mathbb{F}_q$-linear codes of increasing length, whose rates converge to $r \in (0,1)$, achieve capacity over the $q$-ary erasure channel under symbol-MAP decoding. The results for the block-MAP decoder also extend without change. Thus, one finds that Generalized Reed-Muller codes [12] and extended primitive narrow-sense BCH codes over $\mathbb{F}_q$ achieve capacity on the $q$-ary erasure channel under block-MAP decoding.

E. Rates Converging to Zero

Consider a sequence of Reed-Muller codes $\{\mathrm{RM}(v_m, m)\}$ where the rate $r_m \to 0$ sufficiently fast. A key result of [25] is that Reed-Muller codes are capacity achieving in this scenario. That is, for any $\delta > 0$,

$$P_B^{(m)}(p_m) \to 0 \quad \text{for any } 0 \le p_m < 1 - (1 + \delta) r_m.$$

Looking closely at [25, Corollary 5.1], it appears that $r_m = O(N_m^{-\kappa})$ for some $\kappa > 0$ is a necessary condition for this result, where the blocklength $N_m = 2^m$.

Let's analyze the bit erasure probability using our method. From the proof of Theorem 19, it is possible to deduce that $P_b^{(m)}(p_{\epsilon_m}) \to 0$ if we choose $\epsilon_m = o(1)$ such that $\log(1/\epsilon_m) = o(\log(N_m))$.

We can also obtain a lower bound on $p_{\epsilon_m}$. From the proof of Proposition 10, we gather that

$$p_{\epsilon_m} \ge 1 - \frac{r_m}{1 - \epsilon_m} - (p_{1-\epsilon_m} - p_{\epsilon_m}).$$

From Theorem 16 and the proof of Theorem 17,

$$p_{1-\epsilon_m} - p_{\epsilon_m} \le \frac{2 \log \frac{1-\epsilon_m}{\epsilon_m}}{\log(N_m - 1)},$$

which implies that

$$p_{\epsilon_m} \ge 1 - \frac{r_m}{1 - \epsilon_m} - \frac{2 \log \frac{1-\epsilon_m}{\epsilon_m}}{\log(N_m - 1)} = 1 - (1 + \delta_m) r_m,$$

where

$$\delta_m = \frac{\epsilon_m}{1 - \epsilon_m} + \frac{2 \log \frac{1-\epsilon_m}{\epsilon_m}}{r_m \log(N_m - 1)}.$$

Therefore,

$$P_b^{(m)}(p_m) \to 0 \quad \text{for any } 0 \le p_m < 1 - (1 + \delta_m) r_m,$$

for any $\epsilon_m = o(1)$ such that $\log(1/\epsilon_m) = o(\log(N_m))$.

In order to obtain a capacity achieving result under bit-MAP decoding, we require that $\delta_m \to 0$. This can be guaranteed if $r_m \log(N_m) \to \infty$. Under this condition, we can choose $\epsilon_m = 1/\log(r_m \log(N_m))$ so that

$$\epsilon_m \to 0, \qquad \frac{\log \frac{1}{\epsilon_m}}{\log(N_m)} \to 0, \qquad \delta_m \to 0.$$

Thus, under the condition $r_m \log(N_m) \to \infty$, the sequence RM$(v_m, m)$ achieves capacity on the BEC under bit-MAP decoding.

For $r_m \to 0$, our results require $r_m \log(N_m) \to \infty$ while the results in [25, Corollary 5.1] require $r_m = O(N_m^{-\kappa})$ for some $\kappa > 0$. Thus, the results in the two papers apply to two asymptotic rate regimes that are non-overlapping.
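To get a feeling for how slowly the gap term vanishes, the sketch below evaluates $\delta_m = \frac{\epsilon_m}{1 - \epsilon_m} + \frac{2 \log((1 - \epsilon_m)/\epsilon_m)}{r_m \log(N_m - 1)}$ with the choice $\epsilon_m = 1/\log(r_m \log N_m)$ and $N_m = 2^m$. Natural logarithms and the constant $C = 1$ are assumptions, the function name `delta_m` is hypothetical, and the guard enforces $\epsilon_m \le 1/2$ as required by Theorem 16 b).

```python
import math


def delta_m(m, r_m):
    """Gap term delta_m for blocklength N_m = 2^m and rate r_m, with
    eps_m = 1 / log(r_m * log N_m) (natural logarithms assumed)."""
    log_n = m * math.log(2.0)                                  # log N_m
    log_n_minus_1 = log_n + math.log1p(-math.exp(-log_n))      # log(N_m - 1)
    x = r_m * log_n
    if x < math.e ** 2:
        raise ValueError("need r_m * log(N_m) >= e^2 so that eps_m <= 1/2")
    eps = 1.0 / math.log(x)
    return eps / (1 - eps) + 2 * math.log((1 - eps) / eps) / (r_m * log_n_minus_1)


# Rates r_m = 1/sqrt(m) tend to 0 while r_m * log(N_m) still grows without bound.
for m in (10 ** 4, 10 ** 6, 10 ** 8):
    print(m, delta_m(m, 1 / math.sqrt(m)))
```

Even for very large $m$ the first term $\epsilon_m/(1 - \epsilon_m)$ dominates and decays only like $1/\log(r_m \log N_m)$.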


VI. Conclusion 

We show that a sequence of binary linear codes achieves 
capacity if its blocklengths are strictly increasing, its code rates 
converge to some r € (0,1), and the permutation group of 
each code is doubly transitive. To do this, we use isoperimetric 
inequalities for monotone boolean functions to exploit the 
symmetry of the codes. This approach was successful largely 


because the transition point of the limiting EXIT function for 
the capacity-achieving codes is known a priori due to the 
area theorem. One remarkable aspect of this method is its 
simplicity. In particular, this approach does not rely on the 
precise structure of the code. 

The main result extends naturally to F 9 -linear codes trans¬ 
mitted over a q -ary erasure channel under symbol-MAP decod¬ 
ing. The class of affine-invariant F 9 -linear codes also achieve 
capacity, since their permutation group is doubly transitive. 
Our results also show that Generalized Reed-Muller codes and 
extended primitive narrow-sense BCH codes achieve capacity 
on the r/-ary erasure channel under block-MAP decoding. 
Appendix I

Proof of Proposition 10

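S1 $\Leftrightarrow$ S2: First, recall from (3) that $P_b(p) = p\,h(p)$. From
this, it follows that S2 $\Rightarrow$ S1. Now, consider S1 $\Rightarrow$ S2.
The relation $P_b(p) = p\,h(p)$ together with $P_b^{(n)}(p) \to 0$ and
$h^{(n)}(0) = 0$ implies

$$\lim_{n\to\infty} h^{(n)}(p) = 0 \quad \text{for } 0 \le p < 1 - r.$$

Now, we focus on the limit of $h^{(n)}(p)$ for $1 - r < p \le 1$.
Fix $q \in (1 - r, 1]$ and a small $\epsilon > 0$, and choose $n_0$ large
enough so that, for all $n > n_0$, we have $r_n > r - \epsilon$ and
$h^{(n)}(1 - r - \epsilon) < \epsilon$. Such an $n_0$ exists because
$r_n \to r$ and $h^{(n)}(p) \to 0$ for $0 \le p < 1 - r$.
Since the function $h^{(n)}$ is increasing for all $n$, the EXIT area
theorem (i.e., Proposition 4(c)) implies that, for all $n > n_0$,
we have

$$
\begin{aligned}
r - \epsilon < r_n &= \int_0^1 h^{(n)}(p)\,dp \\
&= \int_0^{1-r-\epsilon} h^{(n)}(p)\,dp + \int_{1-r-\epsilon}^{q} h^{(n)}(p)\,dp + \int_q^1 h^{(n)}(p)\,dp \\
&\le (1 - r - \epsilon)\,\epsilon + \bigl(q - (1 - r) + \epsilon\bigr)\,h^{(n)}(q) + (1 - q).
\end{aligned}
$$

This implies

$$
h^{(n)}(q) > \frac{q - (1 - r) - \epsilon(2 - r - \epsilon)}{q - (1 - r) + \epsilon}
\ge 1 - \frac{3\epsilon}{q - (1 - r)}.
$$

Since $\epsilon$ can be chosen arbitrarily small, $\lim_{n\to\infty} h^{(n)}(q) = 1$ for any $1 - r < q \le 1$.

S2 $\Rightarrow$ S3: Since $p_{1-\epsilon}^{(n)} - p_{\epsilon}^{(n)}$ is the width of the erasure
probability interval over which $h^{(n)}$ transitions from $\epsilon$ to $1 - \epsilon$,
this follows immediately from S2.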
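The EXIT area theorem used in this proof can be checked exactly on a small code. The sketch below is an illustration only: it enumerates erasure patterns for the first-order Reed-Muller code of length 8, using the standard fact that on an erasure channel a code bit is recoverable from the other unerased positions exactly when its generator-matrix column lies in the span of their columns. The helper names (`gf2_rank`, `exit_erasure_counts`, `exit_area`) are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def gf2_rank(vectors):
    """Rank over GF(2) of integers interpreted as bit vectors."""
    pivots = {}  # leading-bit position -> basis vector with that leading bit
    for v in vectors:
        while v:
            top = v.bit_length() - 1
            if top not in pivots:
                pivots[top] = v
                break
            v ^= pivots[top]
    return len(pivots)


def exit_erasure_counts(columns, i=0):
    """counts[k] = number of size-k erasure patterns on the other positions
    for which bit i is NOT determined by the unerased positions."""
    others = [c for j, c in enumerate(columns) if j != i]
    n = len(others)
    counts = [0] * (n + 1)
    for k in range(n + 1):
        for erased in combinations(range(n), k):
            erased_set = set(erased)
            seen = [others[j] for j in range(n) if j not in erased_set]
            # Not recoverable iff column i adds a new dimension to the span.
            if gf2_rank(seen + [columns[i]]) > gf2_rank(seen):
                counts[k] += 1
    return counts


def exit_value(counts, p):
    """EXIT function h(p) = sum_k counts[k] p^k (1-p)^(n-k)."""
    n = len(counts) - 1
    return sum(c * p ** k * (1 - p) ** (n - k) for k, c in enumerate(counts))


def exit_area(counts):
    """Exact integral of h over [0, 1], using
    int_0^1 p^k (1-p)^(n-k) dp = 1 / ((n+1) C(n, k))."""
    n = len(counts) - 1
    return sum(Fraction(c, (n + 1) * comb(n, k)) for k, c in enumerate(counts))


# RM(1, 3): column j is (1, bits of j), packed into a 4-bit integer.
RM13_COLUMNS = [0b1000 | j for j in range(8)]
RM13_COUNTS = exit_erasure_counts(RM13_COLUMNS)

if __name__ == "__main__":
    print(RM13_COUNTS, exit_area(RM13_COUNTS))
```

Because this code is transitive, the EXIT function of a single bit equals the average EXIT function, so the computed area should equal the code rate $4/8$.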
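S3 $\Rightarrow$ S2: It suffices to show that, for any $\epsilon \in (0,1/2]$,

$$\lim_{n\to\infty} p_{\epsilon}^{(n)} = \lim_{n\to\infty} p_{1-\epsilon}^{(n)} = 1 - r.$$

From Proposition 4(c), we have

$$r_n = \int_0^1 h^{(n)}(\alpha)\,d\alpha \le \epsilon\,p_{\epsilon}^{(n)} + \left(1 - p_{\epsilon}^{(n)}\right),$$

which implies $p_{\epsilon}^{(n)} \le \frac{1 - r_n}{1 - \epsilon}$. Similarly,

$$r_n = \int_0^1 h^{(n)}(\alpha)\,d\alpha \ge \left(1 - p_{1-\epsilon}^{(n)}\right)(1 - \epsilon),$$

which implies

$$p_{1-\epsilon}^{(n)} \ge \frac{1 - r_n - \epsilon}{1 - \epsilon}.$$

Combining these gives

$$
\begin{aligned}
\frac{1 - r_n - \epsilon}{1 - \epsilon} + \left(p_{\epsilon}^{(n)} - p_{1-\epsilon}^{(n)}\right) &\le p_{\epsilon}^{(n)} \le \frac{1 - r_n}{1 - \epsilon}, \\
\frac{1 - r_n - \epsilon}{1 - \epsilon} &\le p_{1-\epsilon}^{(n)} \le \frac{1 - r_n}{1 - \epsilon} + \left(p_{1-\epsilon}^{(n)} - p_{\epsilon}^{(n)}\right).
\end{aligned}
$$

From the hypothesis,

$$
\begin{aligned}
\frac{1 - r - \epsilon}{1 - \epsilon} &\le \limsup_{n\to\infty} p_{\epsilon}^{(n)} \le \frac{1 - r}{1 - \epsilon}, \\
\frac{1 - r - \epsilon}{1 - \epsilon} &\le \limsup_{n\to\infty} p_{1-\epsilon}^{(n)} \le \frac{1 - r}{1 - \epsilon}.
\end{aligned}
$$

Thus

$$\lim_{t\to 0}\,\limsup_{n\to\infty} p_{t}^{(n)} = \lim_{t\to 0}\,\limsup_{n\to\infty} p_{1-t}^{(n)} = 1 - r.$$

But $p_{t}^{(n)}$ and $p_{1-t}^{(n)}$ are increasing and decreasing functions of
$t$, respectively. This gives

$$
\begin{aligned}
\limsup_{n\to\infty} p_{\epsilon}^{(n)} &\ge \lim_{t\to 0}\,\limsup_{n\to\infty} p_{t}^{(n)} = 1 - r, \\
\limsup_{n\to\infty} p_{1-\epsilon}^{(n)} &\le \lim_{t\to 0}\,\limsup_{n\to\infty} p_{1-t}^{(n)} = 1 - r.
\end{aligned}
$$

Since $p_{\epsilon}^{(n)} \le p_{1-\epsilon}^{(n)}$, we deduce that

$$\limsup_{n\to\infty} p_{\epsilon}^{(n)} = \limsup_{n\to\infty} p_{1-\epsilon}^{(n)} = 1 - r.$$

Repeating this exercise with lim sup replaced by lim inf also
gives the result $1 - r$. Thus, for any $\epsilon \in (0,1/2]$, we have

$$\lim_{n\to\infty} p_{\epsilon}^{(n)} = \lim_{n\to\infty} p_{1-\epsilon}^{(n)} = 1 - r.$$


Appendix II

Proofs from Section III

Lemma 29: Suppose $h\colon [0,1] \to [0,1]$ is a strictly increasing
function with $h(0) = 0$ and $h(1) = 1$. Additionally, for
$0 \le a < p < b \le 1$, let

$$\frac{dh(p)}{dp} \ge w\,h(p)\bigl(1 - h(p)\bigr).$$

If $p_t = h^{-1}(t)$, then for $0 < \epsilon_1 \le \epsilon_2 < 1$,

$$p_{\epsilon_2} - p_{\epsilon_1} \le a + (1 - b) + \frac{1}{w}\left[\log\frac{\epsilon_2}{1 - \epsilon_2} + \log\frac{1 - \epsilon_1}{\epsilon_1}\right]. \tag{14}$$

Moreover, for $0 \le \delta \le p_{1/2}$,

$$h(\delta) \le \exp\bigl[-w\bigl([p_{1/2} - \delta] - [a + 1 - b]\bigr)\bigr].$$

Proof: Let $g(p) = \log\frac{h(p)}{1 - h(p)}$ and observe that, for $a < p < b$, we have

$$\frac{dg(p)}{dp} = \frac{1}{h(p)\bigl(1 - h(p)\bigr)}\,\frac{dh(p)}{dp} \ge w.$$

Let $p_t = h^{-1}(t)$. We would like to obtain an upper bound on
$p_{\epsilon_2} - p_{\epsilon_1}$ by integrating $dg/dp$.

If $a \le p_{\epsilon_1} \le p_{\epsilon_2} \le b$, then integrating $dg/dp$ from $p_{\epsilon_1}$ to
$p_{\epsilon_2}$ gives

$$w\,(p_{\epsilon_2} - p_{\epsilon_1}) \le \int_{p_{\epsilon_1}}^{p_{\epsilon_2}} \frac{dg}{dp}\,dp = \log\frac{\epsilon_2}{1 - \epsilon_2} - \log\frac{\epsilon_1}{1 - \epsilon_1},$$

which immediately shows (14).

Suppose $p_{\epsilon_1} < a \le p_{\epsilon_2} \le b$, and note that since $h$ is
increasing, $\epsilon_1 = h(p_{\epsilon_1}) \le h(a)$. Then, integrating $dg/dp$ from
$a$ to $p_{\epsilon_2}$ gives

$$
\begin{aligned}
w\,(p_{\epsilon_2} - a) &\le \int_{a}^{p_{\epsilon_2}} \frac{dg}{dp}\,dp \\
&= \log\frac{\epsilon_2}{1 - \epsilon_2} - \log\frac{h(a)}{1 - h(a)} \\
&\le \log\frac{\epsilon_2}{1 - \epsilon_2} - \log\frac{\epsilon_1}{1 - \epsilon_1}. \qquad (\text{since } \epsilon_1 \le h(a))
\end{aligned}
$$

Using $p_{\epsilon_2} - p_{\epsilon_1} \le a + (p_{\epsilon_2} - a)$ with the above inequality
gives (14).

By considering the other cases for where $p_{\epsilon_1}$ and $p_{\epsilon_2}$ lie, it is
straightforward to obtain (14). Also, substituting $\epsilon_2 = 1/2$
and $\epsilon_1 = h(\delta)$ in (14) gives the desired upper bound on
$h(\delta)$. ∎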
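To make Lemma 29 concrete, the following Python sketch checks both of its bounds numerically on one example function. The choice of the smoothstep $h(p) = 3p^2 - 2p^3$ is an assumption made purely for illustration (it is not an EXIT function from the paper); for it, $h'(p) / [h(p)(1 - h(p))] = 6 / [p(1-p)(3-2p)(1+2p)] \ge 6$ on $(0,1)$, so the hypothesis holds with $a = 0$, $b = 1$, and $w = 6$.

```python
from math import exp, log


def h(p):
    """Smoothstep 3p^2 - 2p^3: strictly increasing on [0, 1], h(0)=0, h(1)=1."""
    return p * p * (3.0 - 2.0 * p)


def h_inverse(t, tol=1e-12):
    """Invert h by bisection (valid because h is strictly increasing)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(mid) < t:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def logit(x):
    return log(x / (1.0 - x))


def width_bound(e1, e2, w, a=0.0, b=1.0):
    """Right-hand side of the width bound: a + (1-b) + (logit(e2) - logit(e1)) / w."""
    return a + (1.0 - b) + (logit(e2) - logit(e1)) / w


def tail_bound(delta, w, p_half, a=0.0, b=1.0):
    """Right-hand side of the tail bound on h(delta)."""
    return exp(-w * ((p_half - delta) - (a + 1.0 - b)))


W = 6.0  # valid for the smoothstep example only

if __name__ == "__main__":
    e1, e2 = 0.05, 0.95
    print(h_inverse(e2) - h_inverse(e1), width_bound(e1, e2, W))
    print(h(0.1), tail_bound(0.1, W, h_inverse(0.5)))
```

In this example the bounds are not tight, which is expected: the lemma only uses the worst-case value of the derivative ratio over the interval.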

A. Proof of Theorem I Ml 

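Let $p_t^{(n)}$ be the functional inverse of $h^{(n)}$ from (7). Using
Lemma 29 with $a_n = 0$ and $b_n = 1$ gives

$$p_{1-\epsilon}^{(n)} - p_{\epsilon}^{(n)} \le \frac{2 \log\frac{1-\epsilon}{\epsilon}}{C \log N_n}.$$

By hypothesis, $N_n \to \infty$. Thus, for any $\epsilon \in (0,1/2]$, we
have $p_{1-\epsilon}^{(n)} - p_{\epsilon}^{(n)} \to 0$. Using this, we apply statement S2 of
Proposition 10 to see that $p_{1/2}^{(n)} \to 1 - r$.

Now, we can choose $\epsilon_n = d_{\min}^{(n)} / (N_n \log N_n)$ and observe
that

$$
\begin{aligned}
p_{1-\epsilon_n}^{(n)} - p_{\epsilon_n}^{(n)} &\le \frac{2}{C \log N_n} \log\frac{1 - \epsilon_n}{\epsilon_n} \\
&\le \frac{2}{C \log N_n} \log\frac{N_n \log N_n}{d_{\min}^{(n)}} \\
&= \frac{2}{C} \cdot \frac{\log N_n + \log\log N_n - \log d_{\min}^{(n)}}{\log N_n}.
\end{aligned}
$$

By hypothesis, $\log d_{\min}^{(n)} / \log N_n \to 1$. Thus, $p_{1-\epsilon_n}^{(n)} - p_{\epsilon_n}^{(n)} \to 0$.
Combining this with $p_{\epsilon_n}^{(n)} \le p_{1/2}^{(n)} \le p_{1-\epsilon_n}^{(n)}$ shows that
$p_{\epsilon_n}^{(n)} \to 1 - r$.

Also, from (3),

$$P_b^{(n)}\bigl(p_{\epsilon_n}^{(n)}\bigr) = p_{\epsilon_n}^{(n)}\,h^{(n)}\bigl(p_{\epsilon_n}^{(n)}\bigr) \le h^{(n)}\bigl(p_{\epsilon_n}^{(n)}\bigr) = \epsilon_n.$$

Recall that $P_B \le N P_b / d_{\min}$. Hence, for any $p \in [0, 1 - r)$,
one finds that $p_{\epsilon_n}^{(n)} > p$ for sufficiently large $n$ and
thereafter

$$P_B^{(n)}(p) \le \frac{N_n}{d_{\min}^{(n)}}\,P_b^{(n)}(p) \le \frac{N_n}{d_{\min}^{(n)}}\,\epsilon_n = \frac{1}{\log N_n} \to 0.$$

Thus, we conclude that $\{\mathcal{C}_n\}$ is capacity achieving on the BEC
under block-MAP decoding.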


B. Proof of Theorem [79] 

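Let $p_t^{(n)}$ be the functional inverse of $h^{(n)}$. From
Lemma 29,

$$p_{1-\epsilon}^{(n)} - p_{\epsilon}^{(n)} \le a_n + (1 - b_n) + \frac{2 \log\frac{1-\epsilon}{\epsilon}}{w_n \log N_n}.$$

By hypothesis, $a_n \to 0$, $1 - b_n \to 0$, and $w_n \log N_n \to \infty$.
Thus, for any $\epsilon \in (0,1/2]$, we have $p_{1-\epsilon}^{(n)} - p_{\epsilon}^{(n)} \to 0$. Using
this, we apply statement S2 of Proposition 10 to see that
$p_{1/2}^{(n)} \to 1 - r$.

Now, we can choose $\epsilon_n = 1/N_n^2$ and observe that

$$
\begin{aligned}
p_{1-\epsilon_n}^{(n)} - p_{\epsilon_n}^{(n)} &\le a_n + (1 - b_n) + \frac{2}{w_n \log N_n} \log\frac{1 - \epsilon_n}{\epsilon_n} \\
&\le a_n + (1 - b_n) + \frac{4 \log N_n}{w_n \log N_n} \\
&= a_n + (1 - b_n) + \frac{4}{w_n}.
\end{aligned}
$$

Combining $p_{\epsilon_n}^{(n)} \le p_{1/2}^{(n)} \le p_{1-\epsilon_n}^{(n)}$ with $p_{1-\epsilon_n}^{(n)} - p_{\epsilon_n}^{(n)} \to 0$
shows that $p_{\epsilon_n}^{(n)} \to 1 - r$.

Also, from (3),

$$P_b^{(n)}\bigl(p_{\epsilon_n}^{(n)}\bigr) = p_{\epsilon_n}^{(n)}\,h^{(n)}\bigl(p_{\epsilon_n}^{(n)}\bigr) \le h^{(n)}\bigl(p_{\epsilon_n}^{(n)}\bigr) = \epsilon_n.$$

Recall that $P_B \le N P_b$. Hence, for any $p \in [0, 1 - r)$,
one finds that $p_{\epsilon_n}^{(n)} > p$ for sufficiently large $n$ and thereafter

$$P_B^{(n)}(p) \le N_n P_b^{(n)}(p) \le N_n \epsilon_n = N_n / N_n^2 \to 0.$$

Thus, we conclude that $\{\mathcal{C}_n\}$ is capacity achieving on the BEC
under block-MAP decoding.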
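The two elementary estimates used in this proof, with the choice $\epsilon_n = 1/N_n^2$, can be sanity-checked numerically. The sketch below is illustrative only; the function names and the sample values of $N$ and $w$ are assumptions, not quantities from the paper.

```python
from math import isclose, log


def eps_choice(n_block: int) -> float:
    """eps_n = 1 / N_n^2."""
    return 1.0 / (n_block ** 2)


def logit_term(n_block: int, w: float) -> float:
    """(2 / (w log N)) * log((1 - eps) / eps) with eps = 1 / N^2."""
    eps = eps_choice(n_block)
    return 2.0 / (w * log(n_block)) * log((1.0 - eps) / eps)


def block_error_bound(n_block: int) -> float:
    """Union-bound value N * eps_n, which equals 1 / N."""
    return n_block * eps_choice(n_block)


if __name__ == "__main__":
    for n_block in (16, 256, 4096):
        for w in (0.5, 2.0, 8.0):
            # The logit term never exceeds 4 / w because
            # log((1 - eps) / eps) < log(1 / eps) = 2 log N.
            print(n_block, w, logit_term(n_block, w), 4.0 / w)
        print(n_block, block_error_bound(n_block))
```

The width term is therefore controlled by $4/w_n$ alone, while the block error bound decays like $1/N_n$ regardless of $w_n$.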

